TTS models
This document lists all text-to-speech models supported in sherpa-onnx.
Monolingual
The following table lists models by language.
Mixed-lingual
The following sections list models that support multiple languages.
Chinese+English
This section lists text-to-speech models for Chinese+English.
matcha-icefall-zh-en
| Info about this model | Download the model | HF Space | Android APK | Python API |
| C API | C++ API | Rust API | Node.js API | Dart API |
| Swift API | C# API | Kotlin API | Java API | Pascal API |
| Go API | Samples |
Info about this model
This model is trained using code modified from https://github.com/k2-fsa/icefall/tree/master/egs/baker_zh/TTS/matcha
It is from https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010
It supports Chinese and English.
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
You need to download the acoustic model and the vocoder model.
Download the acoustic model
Please use the following code to download the model:
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-en.tar.bz2
tar xvf matcha-icefall-zh-en.tar.bz2
rm matcha-icefall-zh-en.tar.bz2
You should see the following output:
ls -lh matcha-icefall-zh-en/
total 168432
-rw-r--r--@ 1 fangjun staff 58K 4 Dec 14:29 date-zh.fst
drwxr-xr-x@ 122 fangjun staff 3.8K 28 Nov 2023 espeak-ng-data
-rw-r--r--@ 1 fangjun staff 1.3M 4 Dec 14:29 lexicon.txt
-rw-r--r--@ 1 fangjun staff 72M 4 Dec 14:29 model-steps-3.onnx
-rw-r--r--@ 1 fangjun staff 63K 4 Dec 14:29 number-zh.fst
-rw-r--r--@ 1 fangjun staff 87K 4 Dec 14:29 phone-zh.fst
-rw-r--r--@ 1 fangjun staff 2.0K 4 Dec 14:29 README.md
-rw-r--r--@ 1 fangjun staff 21K 4 Dec 14:29 tokens.txt
Download the vocoder model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-16khz-univ.onnx
You should see the following output:
ls -lh vocos-16khz-univ.onnx
-rw-r--r--@ 1 fangjun staff 51M 4 Dec 14:54 vocos-16khz-univ.onnx
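Before running any of the API examples below, it can help to confirm that the files they reference are actually present. The following is a minimal Python sketch and not part of sherpa-onnx: the helper name `find_missing_files` is hypothetical, and the file names are copied from the listings above.

```python
from pathlib import Path

# File names taken from the `ls -lh` listing of matcha-icefall-zh-en above.
ACOUSTIC_MODEL_FILES = [
    "model-steps-3.onnx",
    "lexicon.txt",
    "tokens.txt",
    "date-zh.fst",
    "number-zh.fst",
    "phone-zh.fst",
    "espeak-ng-data",
]


def find_missing_files(model_dir, vocoder, names=ACOUSTIC_MODEL_FILES):
    """Return the expected paths that do not exist (hypothetical helper)."""
    expected = [Path(model_dir) / name for name in names]
    expected.append(Path(vocoder))
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = find_missing_files("matcha-icefall-zh-en", "vocos-16khz-univ.onnx")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Run it from the directory that contains both `matcha-icefall-zh-en/` and `vocos-16khz-univ.onnx`.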
Huggingface space
You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech
Huggingface space (WebAssembly, wasm)
You can try this model by visiting
https://huggingface.co/spaces/k2-fsa/web-assembly-zh-en-tts-matcha
The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
The following code shows how to use the Python API of sherpa-onnx with this model.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
acoustic_model="matcha-icefall-zh-en/model-steps-3.onnx",
vocoder="vocos-16khz-univ.onnx",
lexicon="matcha-icefall-zh-en/lexicon.txt",
tokens="matcha-icefall-zh-en/tokens.txt",
data_dir="matcha-icefall-zh-en/espeak-ng-data",
),
num_threads=2,
debug=True, # set it False to disable debug output
),
max_num_sentences=1,
rule_fsts="matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"
audio = tts.generate(text, sid=0, speed=1.0)
sf.write(
"./test.mp3",
audio.samples,
samplerate=audio.sample_rate,
)
You can save it as test_zh_en.py and then run:
pip install sherpa-onnx soundfile
python3 ./test_zh_en.py
You will get a file test.mp3 in the end.
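The `rule_fsts` argument in the example above is a single comma-separated string of FST paths. A small helper can build that string from a directory and a list of file names. This is only a sketch: `join_rule_fsts` is a hypothetical name, not a sherpa-onnx function, and the file order simply mirrors the example above.

```python
import posixpath


def join_rule_fsts(model_dir, names):
    """Build a comma-separated rule_fsts string (hypothetical helper)."""
    return ",".join(posixpath.join(model_dir, name) for name in names)


rule_fsts = join_rule_fsts(
    "matcha-icefall-zh-en",
    ["phone-zh.fst", "date-zh.fst", "number-zh.fst"],
)
print(rule_fsts)
```

The resulting string can be passed as `rule_fsts=rule_fsts` when constructing `sherpa_onnx.OfflineTtsConfig`.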
C API
You can use the following code to play with matcha-icefall-zh-en using the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.matcha.acoustic_model = "matcha-icefall-zh-en/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-16khz-univ.onnx";
config.model.matcha.lexicon = "matcha-icefall-zh-en/lexicon.txt";
config.model.matcha.tokens = "matcha-icefall-zh-en/tokens.txt";
config.model.matcha.data_dir = "matcha-icefall-zh-en/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh-en.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-zh-en \
/tmp/test-zh-en.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh-en
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh-en.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with matcha-icefall-zh-en using the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.matcha.acoustic_model = "matcha-icefall-zh-en/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-16khz-univ.onnx";
config.model.matcha.lexicon = "matcha-icefall-zh-en/lexicon.txt";
config.model.matcha.tokens = "matcha-icefall-zh-en/tokens.txt";
config.model.matcha.data_dir = "matcha-icefall-zh-en/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";
std::string filename = "./test.wav";
std::string text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh-en.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-zh-en \
/tmp/test-zh-en.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh-en
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh-en.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with matcha-icefall-zh-en using the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
matcha: OfflineTtsMatchaModelConfig {
acoustic_model: Some("matcha-icefall-zh-en/model-steps-3.onnx".into()),
vocoder: Some("vocos-16khz-univ.onnx".into()),
tokens: Some("matcha-icefall-zh-en/tokens.txt".into()),
data_dir: Some("matcha-icefall-zh-en/espeak-ng-data".into()),
lexicon: Some("matcha-icefall-zh-en/lexicon.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
rule_fsts: Some("matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst".into()),
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with matcha-icefall-zh-en with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
matcha: {
acousticModel: 'matcha-icefall-zh-en/model-steps-3.onnx',
vocoder: 'vocos-16khz-univ.onnx',
tokens: 'matcha-icefall-zh-en/tokens.txt',
dataDir: 'matcha-icefall-zh-en/espeak-ng-data',
lexicon: 'matcha-icefall-zh-en/lexicon.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
ruleFsts: 'matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst',
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with matcha-icefall-zh-en using the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
acousticModel: 'matcha-icefall-zh-en/model-steps-3.onnx',
vocoder: 'vocos-16khz-univ.onnx',
tokens: 'matcha-icefall-zh-en/tokens.txt',
dataDir: 'matcha-icefall-zh-en/espeak-ng-data',
lexicon: 'matcha-icefall-zh-en/lexicon.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
matcha: matcha,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: '我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with matcha-icefall-zh-en using the Swift API.
func run() {
let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
acousticModel: "matcha-icefall-zh-en/model-steps-3.onnx",
vocoder: "vocos-16khz-univ.onnx",
tokens: "matcha-icefall-zh-en/tokens.txt",
dataDir: "matcha-icefall-zh-en/espeak-ng-data",
lexicon: "matcha-icefall-zh-en/lexicon.txt"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with matcha-icefall-zh-en using the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-zh-en/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-16khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-zh-en/tokens.txt";
config.Model.Matcha.DataDir = "matcha-icefall-zh-en/espeak-ng-data";
config.Model.Matcha.Lexicon = "matcha-icefall-zh-en/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with matcha-icefall-zh-en using the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
matcha = OfflineTtsMatchaModelConfig(
acousticModel = "matcha-icefall-zh-en/model-steps-3.onnx",
vocoder = "vocos-16khz-univ.onnx",
tokens = "matcha-icefall-zh-en/tokens.txt",
dataDir = "matcha-icefall-zh-en/espeak-ng-data",
lexicon = "matcha-icefall-zh-en/lexicon.txt",
),
numThreads = 1,
debug = true,
),
ruleFsts = "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with matcha-icefall-zh-en using the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var matcha = new OfflineTtsMatchaModelConfig();
matcha.setAcousticModel("matcha-icefall-zh-en/model-steps-3.onnx");
matcha.setVocoder("vocos-16khz-univ.onnx");
matcha.setTokens("matcha-icefall-zh-en/tokens.txt");
matcha.setDataDir("matcha-icefall-zh-en/espeak-ng-data");
matcha.setLexicon("matcha-icefall-zh-en/lexicon.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setMatcha(matcha);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setRuleFsts("matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst");
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with matcha-icefall-zh-en using the Pascal API.
program test_matcha;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Matcha.AcousticModel := 'matcha-icefall-zh-en/model-steps-3.onnx';
Config.Model.Matcha.Vocoder := 'vocos-16khz-univ.onnx';
Config.Model.Matcha.Tokens := 'matcha-icefall-zh-en/tokens.txt';
Config.Model.Matcha.DataDir := 'matcha-icefall-zh-en/espeak-ng-data';
Config.Model.Matcha.Lexicon := 'matcha-icefall-zh-en/lexicon.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.RuleFsts := 'matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst';
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with matcha-icefall-zh-en using the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Matcha: sherpa.OfflineTtsMatchaModelConfig{
AcousticModel: "matcha-icefall-zh-en/model-steps-3.onnx",
Vocoder: "vocos-16khz-univ.onnx",
Tokens: "matcha-icefall-zh-en/tokens.txt",
DataDir: "matcha-icefall-zh-en/espeak-ng-data",
Lexicon: "matcha-icefall-zh-en/lexicon.txt",
},
NumThreads: 1,
Debug: true,
},
RuleFsts: "matcha-icefall-zh-en/phone-zh.fst,matcha-icefall-zh-en/date-zh.fst,matcha-icefall-zh-en/number-zh.fst",
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
我最近在学习machine learning,希望能够在未来的artificial intelligence领域有所建树。在这次vocation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号,拨打110或者189202512043。123456块钱。在这个快速发展的时代,人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一,让机器能够用自然流畅的语音与人类进行交流。
sample audios for different speakers are listed below:
Speaker 0
kokoro-multi-lang-v1_0
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kokoro v1.0 and it is from https://huggingface.co/hexgrad/Kokoro-82M
It supports both Chinese and English.
| Number of speakers | Sample rate |
|---|---|
| 53 | 24000 |
Meaning of speaker prefix
| Prefix | Meaning | sid range | Number of speakers |
|---|---|---|---|
| af | American female | 0 - 10 | 11 |
| am | American male | 11 - 19 | 9 |
| bf | British female | 20 - 23 | 4 |
| bm | British male | 24 - 27 | 4 |
| ef | Spanish female | 28 | 1 |
| em | Spanish male | 29 | 1 |
| ff | French female | 30 | 1 |
| hf | Hindi female | 31 - 32 | 2 |
| hm | Hindi male | 33 - 34 | 2 |
| if | Italian female | 35 | 1 |
| im | Italian male | 36 | 1 |
| jf | Japanese female | 37 - 40 | 4 |
| jm | Japanese male | 41 | 1 |
| pf | Brazilian Portuguese female | 42 | 1 |
| pm | Brazilian Portuguese male | 43 - 44 | 2 |
| zf | Chinese female | 45 - 48 | 4 |
| zm | Chinese male | 49 - 52 | 4 |
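The sid ranges in the table above follow directly from the per-prefix speaker counts, since speaker IDs are assigned consecutively in table order. The Python sketch below recomputes them; `build_sid_ranges` is a hypothetical helper for illustration and is not part of sherpa-onnx.

```python
# (prefix, number of speakers) in the order used by the table above.
PREFIX_COUNTS = [
    ("af", 11), ("am", 9), ("bf", 4), ("bm", 4), ("ef", 1), ("em", 1),
    ("ff", 1), ("hf", 2), ("hm", 2), ("if", 1), ("im", 1), ("jf", 4),
    ("jm", 1), ("pf", 1), ("pm", 2), ("zf", 4), ("zm", 4),
]


def build_sid_ranges(prefix_counts):
    """Map each prefix to an inclusive (first_sid, last_sid) tuple."""
    ranges = {}
    next_sid = 0
    for prefix, count in prefix_counts:
        ranges[prefix] = (next_sid, next_sid + count - 1)
        next_sid += count
    return ranges


SID_RANGES = build_sid_ranges(PREFIX_COUNTS)
print(SID_RANGES["zf"])  # (45, 48)
```

For example, to pick a Chinese female voice you could pass any sid from `SID_RANGES["zf"]` to the generate call.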
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 3 | 0 -> af_alloy | 1 -> af_aoede | 2 -> af_bella | 3 -> af_heart |
| 4 - 7 | 4 -> af_jessica | 5 -> af_kore | 6 -> af_nicole | 7 -> af_nova |
| 8 - 11 | 8 -> af_river | 9 -> af_sarah | 10 -> af_sky | 11 -> am_adam |
| 12 - 15 | 12 -> am_echo | 13 -> am_eric | 14 -> am_fenrir | 15 -> am_liam |
| 16 - 19 | 16 -> am_michael | 17 -> am_onyx | 18 -> am_puck | 19 -> am_santa |
| 20 - 23 | 20 -> bf_alice | 21 -> bf_emma | 22 -> bf_isabella | 23 -> bf_lily |
| 24 - 27 | 24 -> bm_daniel | 25 -> bm_fable | 26 -> bm_george | 27 -> bm_lewis |
| 28 - 31 | 28 -> ef_dora | 29 -> em_alex | 30 -> ff_siwis | 31 -> hf_alpha |
| 32 - 35 | 32 -> hf_beta | 33 -> hm_omega | 34 -> hm_psi | 35 -> if_sara |
| 36 - 39 | 36 -> im_nicola | 37 -> jf_alpha | 38 -> jf_gongitsune | 39 -> jf_nezumi |
| 40 - 43 | 40 -> jf_tebukuro | 41 -> jm_kumo | 42 -> pf_dora | 43 -> pm_alex |
| 44 - 47 | 44 -> pm_santa | 45 -> zf_xiaobei | 46 -> zf_xiaoni | 47 -> zf_xiaoxiao |
| 48 - 51 | 48 -> zf_xiaoyi | 49 -> zm_yunjian | 50 -> zm_yunxi | 51 -> zm_yunxia |
| 52 | 52 -> zm_yunyang |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 3 | af_alloy -> 0 | af_aoede -> 1 | af_bella -> 2 | af_heart -> 3 |
| 4 - 7 | af_jessica -> 4 | af_kore -> 5 | af_nicole -> 6 | af_nova -> 7 |
| 8 - 11 | af_river -> 8 | af_sarah -> 9 | af_sky -> 10 | am_adam -> 11 |
| 12 - 15 | am_echo -> 12 | am_eric -> 13 | am_fenrir -> 14 | am_liam -> 15 |
| 16 - 19 | am_michael -> 16 | am_onyx -> 17 | am_puck -> 18 | am_santa -> 19 |
| 20 - 23 | bf_alice -> 20 | bf_emma -> 21 | bf_isabella -> 22 | bf_lily -> 23 |
| 24 - 27 | bm_daniel -> 24 | bm_fable -> 25 | bm_george -> 26 | bm_lewis -> 27 |
| 28 - 31 | ef_dora -> 28 | em_alex -> 29 | ff_siwis -> 30 | hf_alpha -> 31 |
| 32 - 35 | hf_beta -> 32 | hm_omega -> 33 | hm_psi -> 34 | if_sara -> 35 |
| 36 - 39 | im_nicola -> 36 | jf_alpha -> 37 | jf_gongitsune -> 38 | jf_nezumi -> 39 |
| 40 - 43 | jf_tebukuro -> 40 | jm_kumo -> 41 | pf_dora -> 42 | pm_alex -> 43 |
| 44 - 47 | pm_santa -> 44 | zf_xiaobei -> 45 | zf_xiaoni -> 46 | zf_xiaoxiao -> 47 |
| 48 - 51 | zf_xiaoyi -> 48 | zm_yunjian -> 49 | zm_yunxi -> 50 | zm_yunxia -> 51 |
| 52 - 52 | zm_yunyang -> 52 |
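Because the APIs below take a numeric `sid`, a lookup from speaker name to sid can be convenient. The Python sketch below encodes the mapping from the tables above as an ordered list; `SPEAKER_NAMES` and `sid_of` are hypothetical names introduced here for illustration, not sherpa-onnx identifiers.

```python
# Speaker names in sid order, copied from the tables above.
SPEAKER_NAMES = [
    "af_alloy", "af_aoede", "af_bella", "af_heart", "af_jessica", "af_kore",
    "af_nicole", "af_nova", "af_river", "af_sarah", "af_sky",
    "am_adam", "am_echo", "am_eric", "am_fenrir", "am_liam", "am_michael",
    "am_onyx", "am_puck", "am_santa",
    "bf_alice", "bf_emma", "bf_isabella", "bf_lily",
    "bm_daniel", "bm_fable", "bm_george", "bm_lewis",
    "ef_dora", "em_alex", "ff_siwis",
    "hf_alpha", "hf_beta", "hm_omega", "hm_psi",
    "if_sara", "im_nicola",
    "jf_alpha", "jf_gongitsune", "jf_nezumi", "jf_tebukuro", "jm_kumo",
    "pf_dora", "pm_alex", "pm_santa",
    "zf_xiaobei", "zf_xiaoni", "zf_xiaoxiao", "zf_xiaoyi",
    "zm_yunjian", "zm_yunxi", "zm_yunxia", "zm_yunyang",
]

NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def sid_of(name):
    """Return the sid for a speaker name, or raise ValueError if unknown."""
    if name not in NAME_TO_SID:
        raise ValueError(f"Unknown speaker name: {name}")
    return NAME_TO_SID[name]


print(sid_of("zf_xiaoxiao"))  # 47
```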
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
You can use the following code to play with kokoro-multi-lang-v1_0:
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
model="kokoro-multi-lang-v1_0/model.onnx",
voices="kokoro-multi-lang-v1_0/voices.bin",
tokens="kokoro-multi-lang-v1_0/tokens.txt",
data_dir="kokoro-multi-lang-v1_0/espeak-ng-data",
lexicon="kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
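Writing MP3 through soundfile depends on the libsndfile version it is bundled with. If that is not available in your environment, one alternative is to write a 16-bit PCM WAV file with Python's standard library. The sketch below assumes the samples are mono floats in the range [-1, 1] (an assumption about `audio.samples`, not something stated above); `write_wav_pcm16` is a hypothetical helper name.

```python
import struct
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the example above, this could be called as `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.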
C API
You can use the following code to play with kokoro-multi-lang-v1_0 using the C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kokoro.model = "kokoro-multi-lang-v1_0/model.onnx";
config.model.kokoro.voices = "kokoro-multi-lang-v1_0/voices.bin";
config.model.kokoro.tokens = "kokoro-multi-lang-v1_0/tokens.txt";
config.model.kokoro.data_dir = "kokoro-multi-lang-v1_0/espeak-ng-data";
config.model.kokoro.lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
const char *text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
#endif
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kokoro \
/tmp/test-kokoro.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kokoro-multi-lang-v1_0 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kokoro.model = "kokoro-multi-lang-v1_0/model.onnx";
config.model.kokoro.voices = "kokoro-multi-lang-v1_0/voices.bin";
config.model.kokoro.tokens = "kokoro-multi-lang-v1_0/tokens.txt";
config.model.kokoro.data_dir = "kokoro-multi-lang-v1_0/espeak-ng-data";
config.model.kokoro.lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kokoro \
/tmp/test-kokoro.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kokoro-multi-lang-v1_0 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kokoro: OfflineTtsKokoroModelConfig {
model: Some("kokoro-multi-lang-v1_0/model.onnx".into()),
voices: Some("kokoro-multi-lang-v1_0/voices.bin".into()),
tokens: Some("kokoro-multi-lang-v1_0/tokens.txt".into()),
data_dir: Some("kokoro-multi-lang-v1_0/espeak-ng-data".into()),
lexicon: Some("kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kokoro-multi-lang-v1_0 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
kokoro: {
model: 'kokoro-multi-lang-v1_0/model.onnx',
voices: 'kokoro-multi-lang-v1_0/voices.bin',
tokens: 'kokoro-multi-lang-v1_0/tokens.txt',
dataDir: 'kokoro-multi-lang-v1_0/espeak-ng-data',
lexicon: 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
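The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio. A minimal Python sketch of the same arithmetic follows; the function name and the input values are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 72000 samples at 24000 Hz is 3 seconds of audio.
rtf = real_time_factor(elapsed_seconds=1.5, num_samples=72000,
                       sample_rate=24000)
print(f"RTF = {rtf:.3f}")  # prints: RTF = 0.500
```

An RTF below 1 means the audio is generated faster than it takes to play it back.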
Dart API
You can use the following code to play with kokoro-multi-lang-v1_0 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
model: 'kokoro-multi-lang-v1_0/model.onnx',
voices: 'kokoro-multi-lang-v1_0/voices.bin',
tokens: 'kokoro-multi-lang-v1_0/tokens.txt',
dataDir: 'kokoro-multi-lang-v1_0/espeak-ng-data',
lexicon: 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kokoro: kokoro,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kokoro-multi-lang-v1_0 with Swift API.
func run() {
let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
model: "kokoro-multi-lang-v1_0/model.onnx",
voices: "kokoro-multi-lang-v1_0/voices.bin",
tokens: "kokoro-multi-lang-v1_0/tokens.txt",
dataDir: "kokoro-multi-lang-v1_0/espeak-ng-data",
lexicon: "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kokoro-multi-lang-v1_0 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-multi-lang-v1_0/model.onnx";
config.Model.Kokoro.Voices = "kokoro-multi-lang-v1_0/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-multi-lang-v1_0/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-multi-lang-v1_0/espeak-ng-data";
config.Model.Kokoro.Lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kokoro-multi-lang-v1_0 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kokoro = OfflineTtsKokoroModelConfig(
model = "kokoro-multi-lang-v1_0/model.onnx",
voices = "kokoro-multi-lang-v1_0/voices.bin",
tokens = "kokoro-multi-lang-v1_0/tokens.txt",
dataDir = "kokoro-multi-lang-v1_0/espeak-ng-data",
lexicon = "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kokoro-multi-lang-v1_0 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kokoro = new OfflineTtsKokoroModelConfig();
kokoro.setModel("kokoro-multi-lang-v1_0/model.onnx");
kokoro.setVoices("kokoro-multi-lang-v1_0/voices.bin");
kokoro.setTokens("kokoro-multi-lang-v1_0/tokens.txt");
kokoro.setDataDir("kokoro-multi-lang-v1_0/espeak-ng-data");
kokoro.setLexicon("kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKokoro(kokoro);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kokoro-multi-lang-v1_0 with Pascal API.
program test_kokoro;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kokoro.Model := 'kokoro-multi-lang-v1_0/model.onnx';
Config.Model.Kokoro.Voices := 'kokoro-multi-lang-v1_0/voices.bin';
Config.Model.Kokoro.Tokens := 'kokoro-multi-lang-v1_0/tokens.txt';
Config.Model.Kokoro.DataDir := 'kokoro-multi-lang-v1_0/espeak-ng-data';
Config.Model.Kokoro.Lexicon := 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kokoro-multi-lang-v1_0 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kokoro: sherpa.OfflineTtsKokoroModelConfig{
Model: "kokoro-multi-lang-v1_0/model.onnx",
Voices: "kokoro-multi-lang-v1_0/voices.bin",
Tokens: "kokoro-multi-lang-v1_0/tokens.txt",
DataDir: "kokoro-multi-lang-v1_0/espeak-ng-data",
Lexicon: "kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
This model supports both Chinese and English. 小米的核心价值观是什么?答案
是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习.
我在研究 machine learning。What do you think 中英文说的如何呢?
今天是 2025年6月18号.
sample audios for different speakers are listed below:
Speaker 0 - af_alloy
Speaker 1 - af_aoede
Speaker 2 - af_bella
Speaker 3 - af_heart
Speaker 4 - af_jessica
Speaker 5 - af_kore
Speaker 6 - af_nicole
Speaker 7 - af_nova
Speaker 8 - af_river
Speaker 9 - af_sarah
Speaker 10 - af_sky
Speaker 11 - am_adam
Speaker 12 - am_echo
Speaker 13 - am_eric
Speaker 14 - am_fenrir
Speaker 15 - am_liam
Speaker 16 - am_michael
Speaker 17 - am_onyx
Speaker 18 - am_puck
Speaker 19 - am_santa
Speaker 20 - bf_alice
Speaker 21 - bf_emma
Speaker 22 - bf_isabella
Speaker 23 - bf_lily
Speaker 24 - bm_daniel
Speaker 25 - bm_fable
Speaker 26 - bm_george
Speaker 27 - bm_lewis
Speaker 28 - ef_dora
Speaker 29 - em_alex
Speaker 30 - ff_siwis
Speaker 31 - hf_alpha
Speaker 32 - hf_beta
Speaker 33 - hm_omega
Speaker 34 - hm_psi
Speaker 35 - if_sara
Speaker 36 - im_nicola
Speaker 37 - jf_alpha
Speaker 38 - jf_gongitsune
Speaker 39 - jf_nezumi
Speaker 40 - jf_tebukuro
Speaker 41 - jm_kumo
Speaker 42 - pf_dora
Speaker 43 - pm_alex
Speaker 44 - pm_santa
Speaker 45 - zf_xiaobei
Speaker 46 - zf_xiaoni
Speaker 47 - zf_xiaoxiao
Speaker 48 - zf_xiaoyi
Speaker 49 - zm_yunjian
Speaker 50 - zm_yunxi
Speaker 51 - zm_yunxia
Speaker 52 - zm_yunyang
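The speaker list above is ordered by sid, so a name can be turned into a sid by its position in the list. The following Python sketch copies the 53 names from the list and groups them by their two-letter prefix; the variable names are illustrative and not part of any sherpa-onnx API.

```python
from collections import Counter

# Speaker names copied from the list above, in sid order (sid 0 first).
SPEAKERS = (
    "af_alloy af_aoede af_bella af_heart af_jessica af_kore af_nicole af_nova "
    "af_river af_sarah af_sky am_adam am_echo am_eric am_fenrir am_liam "
    "am_michael am_onyx am_puck am_santa bf_alice bf_emma bf_isabella bf_lily "
    "bm_daniel bm_fable bm_george bm_lewis ef_dora em_alex ff_siwis hf_alpha "
    "hf_beta hm_omega hm_psi if_sara im_nicola jf_alpha jf_gongitsune jf_nezumi "
    "jf_tebukuro jm_kumo pf_dora pm_alex pm_santa zf_xiaobei zf_xiaoni "
    "zf_xiaoxiao zf_xiaoyi zm_yunjian zm_yunxi zm_yunxia zm_yunyang"
).split()

# Number of speakers per prefix, e.g. "af" or "zm".
counts = Counter(name.split("_")[0] for name in SPEAKERS)

# First sid of each prefix group.
first_sid = {}
for sid, name in enumerate(SPEAKERS):
    first_sid.setdefault(name.split("_")[0], sid)

print(len(SPEAKERS), counts["af"], first_sid["zf"], SPEAKERS.index("zm_yunxi"))
```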
kokoro-multi-lang-v1_1
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kokoro v1.1-zh and it is from https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh
It supports both Chinese and English.
| Number of speakers | Sample rate |
|---|---|
| 103 | 24000 |
Meaning of speaker prefix
| Prefix | Meaning | sid range | Number of speakers |
|---|---|---|---|
| af | American female | 0 - 1 | 2 |
| bf | British female | 2 | 1 |
| zf | Chinese female | 3 - 57 | 55 |
| zm | Chinese male | 58 - 102 | 45 |
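The sid ranges in the table above can be used to look up which group a speaker belongs to. Below is a minimal Python sketch; the ranges and meanings are copied from the table, while the names `GROUPS` and `describe_sid` are assumptions made for illustration.

```python
# (prefix, meaning, first sid, last sid), copied from the table above.
GROUPS = [
    ("af", "American female", 0, 1),
    ("bf", "British female", 2, 2),
    ("zf", "Chinese female", 3, 57),
    ("zm", "Chinese male", 58, 102),
]


def describe_sid(sid: int) -> str:
    """Return 'prefix (meaning)' for a sid of kokoro-multi-lang-v1_1."""
    for prefix, meaning, first, last in GROUPS:
        if first <= sid <= last:
            return f"{prefix} ({meaning})"
    raise ValueError(f"sid {sid} is outside 0..102")


# The group sizes add up to the number of speakers listed for this model.
total = sum(last - first + 1 for _, _, first, last in GROUPS)
print(describe_sid(58), total)
```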
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 3 | 0 -> af_maple | 1 -> af_sol | 2 -> bf_vale | 3 -> zf_001 |
| 4 - 7 | 4 -> zf_002 | 5 -> zf_003 | 6 -> zf_004 | 7 -> zf_005 |
| 8 - 11 | 8 -> zf_006 | 9 -> zf_007 | 10 -> zf_008 | 11 -> zf_017 |
| 12 - 15 | 12 -> zf_018 | 13 -> zf_019 | 14 -> zf_021 | 15 -> zf_022 |
| 16 - 19 | 16 -> zf_023 | 17 -> zf_024 | 18 -> zf_026 | 19 -> zf_027 |
| 20 - 23 | 20 -> zf_028 | 21 -> zf_032 | 22 -> zf_036 | 23 -> zf_038 |
| 24 - 27 | 24 -> zf_039 | 25 -> zf_040 | 26 -> zf_042 | 27 -> zf_043 |
| 28 - 31 | 28 -> zf_044 | 29 -> zf_046 | 30 -> zf_047 | 31 -> zf_048 |
| 32 - 35 | 32 -> zf_049 | 33 -> zf_051 | 34 -> zf_059 | 35 -> zf_060 |
| 36 - 39 | 36 -> zf_067 | 37 -> zf_070 | 38 -> zf_071 | 39 -> zf_072 |
| 40 - 43 | 40 -> zf_073 | 41 -> zf_074 | 42 -> zf_075 | 43 -> zf_076 |
| 44 - 47 | 44 -> zf_077 | 45 -> zf_078 | 46 -> zf_079 | 47 -> zf_083 |
| 48 - 51 | 48 -> zf_084 | 49 -> zf_085 | 50 -> zf_086 | 51 -> zf_087 |
| 52 - 55 | 52 -> zf_088 | 53 -> zf_090 | 54 -> zf_092 | 55 -> zf_093 |
| 56 - 59 | 56 -> zf_094 | 57 -> zf_099 | 58 -> zm_009 | 59 -> zm_010 |
| 60 - 63 | 60 -> zm_011 | 61 -> zm_012 | 62 -> zm_013 | 63 -> zm_014 |
| 64 - 67 | 64 -> zm_015 | 65 -> zm_016 | 66 -> zm_020 | 67 -> zm_025 |
| 68 - 71 | 68 -> zm_029 | 69 -> zm_030 | 70 -> zm_031 | 71 -> zm_033 |
| 72 - 75 | 72 -> zm_034 | 73 -> zm_035 | 74 -> zm_037 | 75 -> zm_041 |
| 76 - 79 | 76 -> zm_045 | 77 -> zm_050 | 78 -> zm_052 | 79 -> zm_053 |
| 80 - 83 | 80 -> zm_054 | 81 -> zm_055 | 82 -> zm_056 | 83 -> zm_057 |
| 84 - 87 | 84 -> zm_058 | 85 -> zm_061 | 86 -> zm_062 | 87 -> zm_063 |
| 88 - 91 | 88 -> zm_064 | 89 -> zm_065 | 90 -> zm_066 | 91 -> zm_068 |
| 92 - 95 | 92 -> zm_069 | 93 -> zm_080 | 94 -> zm_081 | 95 -> zm_082 |
| 96 - 99 | 96 -> zm_089 | 97 -> zm_091 | 98 -> zm_095 | 99 -> zm_096 |
| 100 - 102 | 100 -> zm_097 | 101 -> zm_098 | 102 -> zm_100 |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 3 | af_maple -> 0 | af_sol -> 1 | bf_vale -> 2 | zf_001 -> 3 |
| 4 - 7 | zf_002 -> 4 | zf_003 -> 5 | zf_004 -> 6 | zf_005 -> 7 |
| 8 - 11 | zf_006 -> 8 | zf_007 -> 9 | zf_008 -> 10 | zf_017 -> 11 |
| 12 - 15 | zf_018 -> 12 | zf_019 -> 13 | zf_021 -> 14 | zf_022 -> 15 |
| 16 - 19 | zf_023 -> 16 | zf_024 -> 17 | zf_026 -> 18 | zf_027 -> 19 |
| 20 - 23 | zf_028 -> 20 | zf_032 -> 21 | zf_036 -> 22 | zf_038 -> 23 |
| 24 - 27 | zf_039 -> 24 | zf_040 -> 25 | zf_042 -> 26 | zf_043 -> 27 |
| 28 - 31 | zf_044 -> 28 | zf_046 -> 29 | zf_047 -> 30 | zf_048 -> 31 |
| 32 - 35 | zf_049 -> 32 | zf_051 -> 33 | zf_059 -> 34 | zf_060 -> 35 |
| 36 - 39 | zf_067 -> 36 | zf_070 -> 37 | zf_071 -> 38 | zf_072 -> 39 |
| 40 - 43 | zf_073 -> 40 | zf_074 -> 41 | zf_075 -> 42 | zf_076 -> 43 |
| 44 - 47 | zf_077 -> 44 | zf_078 -> 45 | zf_079 -> 46 | zf_083 -> 47 |
| 48 - 51 | zf_084 -> 48 | zf_085 -> 49 | zf_086 -> 50 | zf_087 -> 51 |
| 52 - 55 | zf_088 -> 52 | zf_090 -> 53 | zf_092 -> 54 | zf_093 -> 55 |
| 56 - 59 | zf_094 -> 56 | zf_099 -> 57 | zm_009 -> 58 | zm_010 -> 59 |
| 60 - 63 | zm_011 -> 60 | zm_012 -> 61 | zm_013 -> 62 | zm_014 -> 63 |
| 64 - 67 | zm_015 -> 64 | zm_016 -> 65 | zm_020 -> 66 | zm_025 -> 67 |
| 68 - 71 | zm_029 -> 68 | zm_030 -> 69 | zm_031 -> 70 | zm_033 -> 71 |
| 72 - 75 | zm_034 -> 72 | zm_035 -> 73 | zm_037 -> 74 | zm_041 -> 75 |
| 76 - 79 | zm_045 -> 76 | zm_050 -> 77 | zm_052 -> 78 | zm_053 -> 79 |
| 80 - 83 | zm_054 -> 80 | zm_055 -> 81 | zm_056 -> 82 | zm_057 -> 83 |
| 84 - 87 | zm_058 -> 84 | zm_061 -> 85 | zm_062 -> 86 | zm_063 -> 87 |
| 88 - 91 | zm_064 -> 88 | zm_065 -> 89 | zm_066 -> 90 | zm_068 -> 91 |
| 92 - 95 | zm_069 -> 92 | zm_080 -> 93 | zm_081 -> 94 | zm_082 -> 95 |
| 96 - 99 | zm_089 -> 96 | zm_091 -> 97 | zm_095 -> 98 | zm_096 -> 99 |
| 100 - 102 | zm_097 -> 100 | zm_098 -> 101 | zm_100 -> 102 |
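Both tables describe the same ordered list of names, so one lookup can be derived from the other. The following Python sketch uses only the first 12 entries as an excerpt; extend `NAMES` with the remaining rows to cover all 103 speakers. The variable names are illustrative.

```python
# First 12 entries of the tables above, in sid order (the full list has 103).
NAMES = ["af_maple", "af_sol", "bf_vale", "zf_001", "zf_002", "zf_003",
         "zf_004", "zf_005", "zf_006", "zf_007", "zf_008", "zf_017"]

sid_to_name = dict(enumerate(NAMES))
name_to_sid = {name: sid for sid, name in sid_to_name.items()}

print(name_to_sid["zf_017"], sid_to_name[2])  # prints: 11 bf_vale
```

Note that the number in a speaker name is not its sid: zf_017 has sid 11.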
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
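If you prefer to download and extract the archive from a script, the following Python sketch uses only the standard library. It assumes that the archive unpacks into a directory named kokoro-multi-lang-v1_1, which matches the paths used in the examples on this page; the helper names are hypothetical.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = ("https://github.com/k2-fsa/sherpa-onnx/releases/download/"
       "tts-models/kokoro-multi-lang-v1_1.tar.bz2")
SUFFIX = ".tar.bz2"


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url: str) -> str:
    """Return the archive name without its .tar.bz2 suffix (assumed dir name)."""
    name = archive_name(url)
    return name[: -len(SUFFIX)] if name.endswith(SUFFIX) else name


def download_and_extract(url: str, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = dest / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))  # network access happens here
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return dest / model_dir_name(url)
```

Calling `download_and_extract(URL)` from the directory where you run the examples should leave a kokoro-multi-lang-v1_1 directory next to your script.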
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx and soundfile (used below to write the audio file) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
You can use the following code to play with kokoro-multi-lang-v1_1
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
model="kokoro-multi-lang-v1_1/model.onnx",
voices="kokoro-multi-lang-v1_1/voices.bin",
tokens="kokoro-multi-lang-v1_1/tokens.txt",
data_dir="kokoro-multi-lang-v1_1/espeak-ng-data",
lexicon="kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
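Before creating the config, it can help to check that the extracted model directory contains every file the example refers to. The following Python sketch does that and also builds the comma-separated lexicon argument; the file names are taken from the code above, while the helper names are assumptions.

```python
from pathlib import Path
from typing import List

# Files and directories referenced by the example above.
REQUIRED = ["model.onnx", "voices.bin", "tokens.txt", "espeak-ng-data",
            "lexicon-us-en.txt", "lexicon-zh.txt"]


def missing_files(model_dir: str) -> List[str]:
    """Return the required entries that do not exist inside model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED if not (root / name).exists()]


def lexicon_arg(model_dir: str) -> str:
    """Join the two lexicon paths with a comma, as the lexicon option expects."""
    return ",".join(f"{model_dir}/{name}"
                    for name in ("lexicon-us-en.txt", "lexicon-zh.txt"))


print(lexicon_arg("kokoro-multi-lang-v1_1"))
print(missing_files("kokoro-multi-lang-v1_1"))  # empty list if all is in place
```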
C API
You can use the following code to play with kokoro-multi-lang-v1_1 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
const char *text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
#endif
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kokoro \
/tmp/test-kokoro.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kokoro-multi-lang-v1_1 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kokoro.model = "kokoro-multi-lang-v1_1/model.onnx";
config.model.kokoro.voices = "kokoro-multi-lang-v1_1/voices.bin";
config.model.kokoro.tokens = "kokoro-multi-lang-v1_1/tokens.txt";
config.model.kokoro.data_dir = "kokoro-multi-lang-v1_1/espeak-ng-data";
config.model.kokoro.lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kokoro \
/tmp/test-kokoro.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kokoro-multi-lang-v1_1 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kokoro: OfflineTtsKokoroModelConfig {
model: Some("kokoro-multi-lang-v1_1/model.onnx".into()),
voices: Some("kokoro-multi-lang-v1_1/voices.bin".into()),
tokens: Some("kokoro-multi-lang-v1_1/tokens.txt".into()),
data_dir: Some("kokoro-multi-lang-v1_1/espeak-ng-data".into()),
lexicon: Some("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kokoro-multi-lang-v1_1 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
kokoro: {
model: 'kokoro-multi-lang-v1_1/model.onnx',
voices: 'kokoro-multi-lang-v1_1/voices.bin',
tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
lexicon: 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
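The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means generation ran faster than real time. The same arithmetic can be sketched in Python. The function name and the numbers below are illustrative only and are not part of sherpa-onnx.

```python
def real_time_factor(num_samples: int, sample_rate: int, elapsed_seconds: float) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Example numbers: 48000 samples at 24000 Hz is 2 seconds of audio.
    # Generating it in 0.5 seconds gives elapsed / duration = 0.5 / 2.
    print(f"RTF = {real_time_factor(48000, 24000, 0.5):.3f}")
```

The guard against non-positive inputs avoids a division by zero when generation returns no samples.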
Dart API
You can use the following code to play with kokoro-multi-lang-v1_1 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
model: 'kokoro-multi-lang-v1_1/model.onnx',
voices: 'kokoro-multi-lang-v1_1/voices.bin',
tokens: 'kokoro-multi-lang-v1_1/tokens.txt',
dataDir: 'kokoro-multi-lang-v1_1/espeak-ng-data',
lexicon: 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kokoro: kokoro,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kokoro-multi-lang-v1_1 with Swift API.
func run() {
let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
model: "kokoro-multi-lang-v1_1/model.onnx",
voices: "kokoro-multi-lang-v1_1/voices.bin",
tokens: "kokoro-multi-lang-v1_1/tokens.txt",
dataDir: "kokoro-multi-lang-v1_1/espeak-ng-data",
lexicon: "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kokoro-multi-lang-v1_1 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-multi-lang-v1_1/model.onnx";
config.Model.Kokoro.Voices = "kokoro-multi-lang-v1_1/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-multi-lang-v1_1/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-multi-lang-v1_1/espeak-ng-data";
config.Model.Kokoro.Lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kokoro-multi-lang-v1_1 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kokoro = OfflineTtsKokoroModelConfig(
model = "kokoro-multi-lang-v1_1/model.onnx",
voices = "kokoro-multi-lang-v1_1/voices.bin",
tokens = "kokoro-multi-lang-v1_1/tokens.txt",
dataDir = "kokoro-multi-lang-v1_1/espeak-ng-data",
lexicon = "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kokoro-multi-lang-v1_1 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kokoro = new OfflineTtsKokoroModelConfig();
kokoro.setModel("kokoro-multi-lang-v1_1/model.onnx");
kokoro.setVoices("kokoro-multi-lang-v1_1/voices.bin");
kokoro.setTokens("kokoro-multi-lang-v1_1/tokens.txt");
kokoro.setDataDir("kokoro-multi-lang-v1_1/espeak-ng-data");
kokoro.setLexicon("kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKokoro(kokoro);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kokoro-multi-lang-v1_1 with Pascal API.
program test_kokoro;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kokoro.Model := 'kokoro-multi-lang-v1_1/model.onnx';
Config.Model.Kokoro.Voices := 'kokoro-multi-lang-v1_1/voices.bin';
Config.Model.Kokoro.Tokens := 'kokoro-multi-lang-v1_1/tokens.txt';
Config.Model.Kokoro.DataDir := 'kokoro-multi-lang-v1_1/espeak-ng-data';
Config.Model.Kokoro.Lexicon := 'kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kokoro-multi-lang-v1_1 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kokoro: sherpa.OfflineTtsKokoroModelConfig{
Model: "kokoro-multi-lang-v1_1/model.onnx",
Voices: "kokoro-multi-lang-v1_1/voices.bin",
Tokens: "kokoro-multi-lang-v1_1/tokens.txt",
DataDir: "kokoro-multi-lang-v1_1/espeak-ng-data",
Lexicon: "kokoro-multi-lang-v1_1/lexicon-us-en.txt,kokoro-multi-lang-v1_1/lexicon-zh.txt",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "This model supports both Chinese and English. 小米的核心价值观是什么?答案是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢?今天是 2025年6月18号."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
This model supports both Chinese and English. 小米的核心价值观是什么?答案
是真诚热爱!有困难,请拨打110 或者18601200909。I am learning 机器学习.
我在研究 machine learning。What do you think 中英文说的如何呢?
今天是 2025年6月18号.
audio samples for the different speakers are listed below:
Speaker 0 - af_maple
Speaker 1 - af_sol
Speaker 2 - bf_vale
Speaker 3 - zf_001
Speaker 4 - zf_002
Speaker 5 - zf_003
Speaker 6 - zf_004
Speaker 7 - zf_005
Speaker 8 - zf_006
Speaker 9 - zf_007
Speaker 10 - zf_008
Speaker 11 - zf_017
Speaker 12 - zf_018
Speaker 13 - zf_019
Speaker 14 - zf_021
Speaker 15 - zf_022
Speaker 16 - zf_023
Speaker 17 - zf_024
Speaker 18 - zf_026
Speaker 19 - zf_027
Speaker 20 - zf_028
Speaker 21 - zf_032
Speaker 22 - zf_036
Speaker 23 - zf_038
Speaker 24 - zf_039
Speaker 25 - zf_040
Speaker 26 - zf_042
Speaker 27 - zf_043
Speaker 28 - zf_044
Speaker 29 - zf_046
Speaker 30 - zf_047
Speaker 31 - zf_048
Speaker 32 - zf_049
Speaker 33 - zf_051
Speaker 34 - zf_059
Speaker 35 - zf_060
Speaker 36 - zf_067
Speaker 37 - zf_070
Speaker 38 - zf_071
Speaker 39 - zf_072
Speaker 40 - zf_073
Speaker 41 - zf_074
Speaker 42 - zf_075
Speaker 43 - zf_076
Speaker 44 - zf_077
Speaker 45 - zf_078
Speaker 46 - zf_079
Speaker 47 - zf_083
Speaker 48 - zf_084
Speaker 49 - zf_085
Speaker 50 - zf_086
Speaker 51 - zf_087
Speaker 52 - zf_088
Speaker 53 - zf_090
Speaker 54 - zf_092
Speaker 55 - zf_093
Speaker 56 - zf_094
Speaker 57 - zf_099
Speaker 58 - zm_009
Speaker 59 - zm_010
Speaker 60 - zm_011
Speaker 61 - zm_012
Speaker 62 - zm_013
Speaker 63 - zm_014
Speaker 64 - zm_015
Speaker 65 - zm_016
Speaker 66 - zm_020
Speaker 67 - zm_025
Speaker 68 - zm_029
Speaker 69 - zm_030
Speaker 70 - zm_031
Speaker 71 - zm_033
Speaker 72 - zm_034
Speaker 73 - zm_035
Speaker 74 - zm_037
Speaker 75 - zm_041
Speaker 76 - zm_045
Speaker 77 - zm_050
Speaker 78 - zm_052
Speaker 79 - zm_053
Speaker 80 - zm_054
Speaker 81 - zm_055
Speaker 82 - zm_056
Speaker 83 - zm_057
Speaker 84 - zm_058
Speaker 85 - zm_061
Speaker 86 - zm_062
Speaker 87 - zm_063
Speaker 88 - zm_064
Speaker 89 - zm_065
Speaker 90 - zm_066
Speaker 91 - zm_068
Speaker 92 - zm_069
Speaker 93 - zm_080
Speaker 94 - zm_081
Speaker 95 - zm_082
Speaker 96 - zm_089
Speaker 97 - zm_091
Speaker 98 - zm_095
Speaker 99 - zm_096
Speaker 100 - zm_097
Speaker 101 - zm_098
Speaker 102 - zm_100
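The list above pairs each speaker ID (the `sid` value used in the API examples) with a voice name. A minimal sketch of turning such lines into a name-to-ID lookup follows. The helper name is hypothetical, and only a handful of lines from the list are copied in to keep the example short.

```python
import re

# A few lines copied from the speaker list above.
SPEAKER_LINES = """\
Speaker 0 - af_maple
Speaker 1 - af_sol
Speaker 2 - bf_vale
Speaker 3 - zf_001
Speaker 58 - zm_009
Speaker 102 - zm_100
"""

# Matches lines of the form "Speaker <id> - <name>".
_LINE = re.compile(r"^Speaker\s+(\d+)\s+-\s+(\S+)$")


def parse_speakers(text: str) -> dict[str, int]:
    """Map voice name -> speaker ID (hypothetical helper)."""
    name_to_sid = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            name_to_sid[match.group(2)] = int(match.group(1))
    return name_to_sid


if __name__ == "__main__":
    speakers = parse_speakers(SPEAKER_LINES)
    # The looked-up number is what would be passed as `sid`.
    print("zf_001 ->", speakers["zf_001"])
```

Looking voices up by name keeps application code readable and avoids hard-coding numeric IDs that are easy to mistype.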
Arabic
This section lists text to speech models for Arabic.
vits-piper-ar_JO-SA_dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know which ABI to choose, arm64-v8a is most likely the one you need.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "كيف حالك اليوم؟";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
data_dir="vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
tokens="vits-piper-ar_JO-SA_dii-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
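When `config.validate()` fails in the example above, one likely cause is that a configured path does not exist, for instance because the script is run from a different directory than the one the model was extracted into. The sketch below, using only the standard library, reports which expected files are missing. The helper name is hypothetical; the file names are the ones used in the example above.

```python
from pathlib import Path


def missing_model_files(model_dir: str, names: list[str]) -> list[str]:
    """Return the entries of `names` that do not exist under `model_dir`."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    expected = ["ar_JO-SA_dii-high.onnx", "tokens.txt", "espeak-ng-data"]
    missing = missing_model_files("vits-piper-ar_JO-SA_dii-high", expected)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running a check like this first gives a more specific message than the generic "Please check your config" error.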
C++ API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "كيف حالك اليوم؟";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx".into()),
tokens: Some("vits-piper-ar_JO-SA_dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-ar_JO-SA_dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "كيف حالك اليوم؟";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
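The progress callback in the Rust example above, like the one in the C++ example, is invoked while audio is being generated, and its return value decides whether generation continues. The sketch below shows that pattern in Python: a callable that collects chunks and asks to stop once a time budget is reached. It is self-contained and does not call sherpa-onnx. The class name is hypothetical, and wiring such an object into the Python `generate()` call as a callback (with 1 meaning continue and 0 meaning stop) is an assumption to verify against the Python API documentation.

```python
class StopAfter:
    """Collect audio chunks; request a stop once max_seconds is reached."""

    def __init__(self, sample_rate: int, max_seconds: float):
        self.sample_rate = sample_rate
        self.max_samples = int(sample_rate * max_seconds)
        self.chunks = []
        self.num_samples = 0

    def __call__(self, samples, progress: float) -> int:
        self.chunks.append(list(samples))
        self.num_samples += len(samples)
        print(f"Progress: {progress * 100:.1f}%")
        # 1 means continue generating, 0 means stop
        return 1 if self.num_samples < self.max_samples else 0


if __name__ == "__main__":
    cb = StopAfter(sample_rate=22050, max_seconds=1.0)
    # Simulate a generator that delivers chunks of 0.5 seconds each.
    chunk = [0.0] * 11025
    for i in range(3):
        if cb(chunk, (i + 1) / 3) == 0:
            break
    print("Collected", cb.num_samples, "samples")
```

Keeping the state in an object rather than in globals makes it possible to run several generations with independent budgets.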
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx',
tokens: 'vits-piper-ar_JO-SA_dii-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'كيف حالك اليوم؟';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx',
tokens: 'vits-piper-ar_JO-SA_dii-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
lexicon: "",
tokens: "vits-piper-ar_JO-SA_dii-high/tokens.txt",
dataDir: "vits-piper-ar_JO-SA_dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "كيف حالك اليوم؟"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
tokens = "vits-piper-ar_JO-SA_dii-high/tokens.txt",
dataDir = "vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "كيف حالك اليوم؟",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx");
vits.setTokens("vits-piper-ar_JO-SA_dii-high/tokens.txt");
vits.setDataDir("vits-piper-ar_JO-SA_dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ar_JO-SA_dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ar_JO-SA_dii-high/ar_JO-SA_dii-high.onnx",
Tokens: "vits-piper-ar_JO-SA_dii-high/tokens.txt",
DataDir: "vits-piper-ar_JO-SA_dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "كيف حالك اليوم؟"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
كيف حالك اليوم؟
audio samples for the different speakers are listed below:
Speaker 0
vits-piper-ar_JO-SA_miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know which ABI to choose, arm64-v8a is most likely the one you need.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "كيف حالك اليوم؟";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
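If you launch the compiled binary from a script rather than from an interactive shell, the same library search path can be passed to the child process only. The following is a minimal sketch, not part of sherpa-onnx; the helper names loader_path_variable and env_with_library_dir are made up for illustration, and the paths are the ones used in the build steps above.

```python
import os
import subprocess
import sys


def loader_path_variable(platform: str = sys.platform) -> str:
    """Return the environment variable the dynamic loader consults."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(lib_dir: str, platform: str = sys.platform) -> dict:
    """Copy os.environ and prepend lib_dir to the loader search path."""
    env = dict(os.environ)
    name = loader_path_variable(platform)
    existing = env.get(name, "")
    env[name] = lib_dir + (os.pathsep + existing if existing else "")
    return env


def run_test_piper() -> None:
    """Run /tmp/test-piper from /tmp with the shared libraries visible."""
    env = env_with_library_dir("/tmp/sherpa-onnx/shared/lib")
    subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```

Calling run_test_piper() has the same effect as exporting the variable and then running ./test-piper from /tmp, but it leaves the environment of the calling process unchanged.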
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ar_JO-SA_miro-high
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
            data_dir="vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
            tokens="vits-piper-ar_JO-SA_miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
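The Python example raises a ValueError when config.validate() fails, and a common cause is running the script from a directory that does not contain the extracted model. The sketch below checks the three paths the example refers to before the config is built. The helper missing_model_files is hypothetical and is not part of sherpa_onnx.

```python
from pathlib import Path


def missing_model_files(model_dir: str, onnx_name: str) -> list:
    """Return the expected paths under model_dir that do not exist.

    The three entries mirror the model, tokens, and data_dir arguments
    used in the example above.
    """
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


# Example use (paths as in the example above):
#   missing = missing_model_files("vits-piper-ar_JO-SA_miro-high",
#                                 "ar_JO-SA_miro-high.onnx")
#   if missing:
#       raise FileNotFoundError(f"Missing model files: {missing}")
```

An empty list means all three paths exist relative to the current working directory.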
C++ API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "كيف حالك اليوم؟";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx".into()),
tokens: Some("vits-piper-ar_JO-SA_miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-ar_JO-SA_miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "كيف حالك اليوم؟";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx',
tokens: 'vits-piper-ar_JO-SA_miro-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'كيف حالك اليوم؟';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
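The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. The same arithmetic is sketched below in Python. The function name real_time_factor is chosen here for illustration only.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Example use: 0.5 s spent generating 2 s of audio at 22050 Hz.
#   real_time_factor(0.5, 44100, 22050)
```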
Dart API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx',
tokens: 'vits-piper-ar_JO-SA_miro-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
lexicon: "",
tokens: "vits-piper-ar_JO-SA_miro-high/tokens.txt",
dataDir: "vits-piper-ar_JO-SA_miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "كيف حالك اليوم؟"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
tokens = "vits-piper-ar_JO-SA_miro-high/tokens.txt",
dataDir = "vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "كيف حالك اليوم؟",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx");
vits.setTokens("vits-piper-ar_JO-SA_miro-high/tokens.txt");
vits.setDataDir("vits-piper-ar_JO-SA_miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ar_JO-SA_miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ar_JO-SA_miro-high/ar_JO-SA_miro-high.onnx",
Tokens: "vits-piper-ar_JO-SA_miro-high/tokens.txt",
DataDir: "vits-piper-ar_JO-SA_miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "كيف حالك اليوم؟"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
كيف حالك اليوم؟
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ar_JO-SA_miro_V2-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "كيف حالك اليوم؟";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
            data_dir="vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
            tokens="vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
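The Python example saves the audio with soundfile. If that package is unavailable, or if you want a WAV file like the other language examples produce, the standard library can write one. The sketch below assumes that audio.samples is a sequence of mono float samples in the range [-1, 1]; that range is an assumption here and is not stated on this page. The helper write_wav_int16 is hypothetical.

```python
import struct
import wave


def write_wav_int16(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples to path as 16-bit PCM WAV.

    Values outside [-1, 1] are clipped before conversion.
    """
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Example use with the objects from the example above:
#   write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```

The per-sample loop keeps the sketch dependency-free; for long utterances a vectorized conversion would be faster.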
C++ API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "كيف حالك اليوم؟";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
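The ProgressCallback in the C++ example returns 1 to continue generating and 0 to stop. The sketch below shows the same convention as a Python closure that asks generation to stop once a sample budget is reached. The factory make_sample_budget_callback is hypothetical, and whether the Python binding accepts a callback with this exact (samples, progress) signature is an assumption; please check the Python API documentation before wiring it in.

```python
def make_sample_budget_callback(max_samples: int):
    """Return a callback that requests a stop after max_samples samples.

    The callback follows the convention used in the C++ example:
    return 1 to continue generating, 0 to stop.
    """
    received = 0

    def callback(samples, progress):
        nonlocal received
        received += len(samples)
        return 1 if received < max_samples else 0

    return callback


# Example use: allow roughly 5 seconds of audio at 22050 Hz.
#   callback = make_sample_budget_callback(5 * 22050)
```

Keeping the counter inside a closure avoids a global variable and lets each generation call get its own budget.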
Rust API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx".into()),
tokens: Some("vits-piper-ar_JO-SA_miro_V2-high/tokens.txt".into()),
data_dir: Some("vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "كيف حالك اليوم؟";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx',
tokens: 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'كيف حالك اليوم؟';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx',
tokens: 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt',
dataDir: 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
lexicon: "",
tokens: "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
dataDir: "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "كيف حالك اليوم؟"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
tokens = "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
dataDir = "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "كيف حالك اليوم؟",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx");
vits.setTokens("vits-piper-ar_JO-SA_miro_V2-high/tokens.txt");
vits.setDataDir("vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ar_JO-SA_miro_V2-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ar_JO-SA_miro_V2-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ar_JO-SA_miro_V2-high/ar_JO-SA_miro_V2-high.onnx",
Tokens: "vits-piper-ar_JO-SA_miro_V2-high/tokens.txt",
DataDir: "vits-piper-ar_JO-SA_miro_V2-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "كيف حالك اليوم؟"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
كيف حالك اليوم؟
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ar_JO-kareem-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ar/ar_JO/kareem/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ar_JO-kareem-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "كيف حالك اليوم؟";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
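When the binary is launched from a script rather than an interactive shell, the same library search path can be set for the child process only. The following Python sketch assumes the install prefix and binary path used above; the helper names `env_with_library_path` and `run_example` are hypothetical.

```python
import os
import platform
import subprocess


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + existing if existing else "")
    return env


def run_example(binary="/tmp/test-piper",
                lib_dir="/tmp/sherpa-onnx/shared/lib"):
    """Run the compiled example from /tmp with the adjusted environment.

    Assumes the model has been extracted to /tmp, as described above.
    """
    env = env_with_library_path(lib_dir)
    subprocess.run([binary], env=env, cwd="/tmp", check=True)
```

Calling `run_example()` leaves the parent shell's environment untouched, which avoids exporting the variable globally.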
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ar_JO-kareem-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
data_dir="vits-piper-ar_JO-kareem-low/espeak-ng-data",
tokens="vits-piper-ar_JO-kareem-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to try vits-piper-ar_JO-kareem-low with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "كيف حالك اليوم؟";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx".into()),
tokens: Some("vits-piper-ar_JO-kareem-low/tokens.txt".into()),
data_dir: Some("vits-piper-ar_JO-kareem-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "كيف حالك اليوم؟";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ar_JO-kareem-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx',
tokens: 'vits-piper-ar_JO-kareem-low/tokens.txt',
dataDir: 'vits-piper-ar_JO-kareem-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'كيف حالك اليوم؟';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
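The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic as a minimal Python sketch follows; the function names are illustrative and not part of sherpa-onnx.

```python
import time


def audio_duration_seconds(num_samples, sample_rate):
    """Duration of mono audio: number of samples / sample rate."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    Values below 1 mean synthesis ran faster than playback.
    """
    duration = audio_duration_seconds(num_samples, sample_rate)
    if duration == 0:
        raise ValueError("no audio was generated")
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, 16000 samples at 16000 Hz is one second of audio, so generating it in 0.5 seconds gives an RTF of 0.5.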
Dart API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx',
tokens: 'vits-piper-ar_JO-kareem-low/tokens.txt',
dataDir: 'vits-piper-ar_JO-kareem-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
lexicon: "",
tokens: "vits-piper-ar_JO-kareem-low/tokens.txt",
dataDir: "vits-piper-ar_JO-kareem-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "كيف حالك اليوم؟"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to try vits-piper-ar_JO-kareem-low with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-kareem-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-kareem-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
tokens = "vits-piper-ar_JO-kareem-low/tokens.txt",
dataDir = "vits-piper-ar_JO-kareem-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "كيف حالك اليوم؟",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx");
vits.setTokens("vits-piper-ar_JO-kareem-low/tokens.txt");
vits.setDataDir("vits-piper-ar_JO-kareem-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ar_JO-kareem-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ar_JO-kareem-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to try vits-piper-ar_JO-kareem-low with the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ar_JO-kareem-low/ar_JO-kareem-low.onnx",
Tokens: "vits-piper-ar_JO-kareem-low/tokens.txt",
DataDir: "vits-piper-ar_JO-kareem-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "كيف حالك اليوم؟"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
كيف حالك اليوم؟
audio samples for each speaker are listed below:
Speaker 0
vits-piper-ar_JO-kareem-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ar/ar_JO/kareem/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "كيف حالك اليوم؟";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ar_JO-kareem-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
data_dir="vits-piper-ar_JO-kareem-medium/espeak-ng-data",
tokens="vits-piper-ar_JO-kareem-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="كيف حالك اليوم؟",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
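The example above relies on the soundfile package to save the result. As an alternative, a WAV file can be written with only the Python standard library. The following is a minimal sketch, assuming that `audio.samples` is a sequence of mono floats in the range [-1, 1] and that `audio.sample_rate` is an integer such as 22050; the helper `write_wave_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumption: samples are floats in [-1, 1]; values outside that
    range are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)           # mono
        f.setsampwidth(2)           # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.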
C++ API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
config.model.vits.tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "كيف حالك اليوم؟";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx".into()),
tokens: Some("vits-piper-ar_JO-kareem-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ar_JO-kareem-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "كيف حالك اليوم؟";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ar_JO-kareem-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx',
tokens: 'vits-piper-ar_JO-kareem-medium/tokens.txt',
dataDir: 'vits-piper-ar_JO-kareem-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'كيف حالك اليوم؟';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx',
tokens: 'vits-piper-ar_JO-kareem-medium/tokens.txt',
dataDir: 'vits-piper-ar_JO-kareem-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'كيف حالك اليوم؟', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
lexicon: "",
tokens: "vits-piper-ar_JO-kareem-medium/tokens.txt",
dataDir: "vits-piper-ar_JO-kareem-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "كيف حالك اليوم؟"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
tokens = "vits-piper-ar_JO-kareem-medium/tokens.txt",
dataDir = "vits-piper-ar_JO-kareem-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "كيف حالك اليوم؟",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx");
vits.setTokens("vits-piper-ar_JO-kareem-medium/tokens.txt");
vits.setDataDir("vits-piper-ar_JO-kareem-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "كيف حالك اليوم؟";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ar_JO-kareem-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ar_JO-kareem-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('كيف حالك اليوم؟', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to try vits-piper-ar_JO-kareem-medium with the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ar_JO-kareem-medium/ar_JO-kareem-medium.onnx",
Tokens: "vits-piper-ar_JO-kareem-medium/tokens.txt",
DataDir: "vits-piper-ar_JO-kareem-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "كيف حالك اليوم؟"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
كيف حالك اليوم؟
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-ar
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Arabic (ar).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
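The language list and speaker range above can be checked before calling the engine. The following is a minimal sketch, not part of sherpa-onnx: `make_extra` and `check_sid` are hypothetical helper names, and the JSON string they produce mirrors the `{"lang": "ar"}` value that several of the API examples on this page pass through the `extra` field.

```python
import json

# The 31 language codes listed above for supertonic-3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # from the table above; valid speaker IDs are 0..9


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is in range, otherwise raise ValueError."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


print(make_extra("ar"))
print(check_sid(0))
```

Validating these two values up front gives a clear error message instead of relying on whatever the engine does with an unknown language code or an out-of-range speaker ID.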
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ar"
audio = tts.generate("هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
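Since the configuration above references seven separate files, it can help to confirm they are all present before creating the engine. This is a minimal sketch using only the standard library; `missing_model_files` is a hypothetical helper, and the file names are copied from the example above.

```python
from pathlib import Path

# File names taken from the configuration in the example above.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir: str) -> list[str]:
    """Return the required file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

Running this from the directory that contains the extracted model folder reports which files, if any, still need to be downloaded.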
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"ar\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
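The callback contract used in the example above (return 1 to continue generating, return 0 to stop) can be illustrated with a small simulation that does not need sherpa-onnx at all. Everything here is an assumption made for illustration: `fake_generate` is a hypothetical stand-in for the engine, and whether the real engine keeps the chunk that triggered the stop is not stated on this page.

```python
from typing import Callable, List


def fake_generate(
    chunks: List[List[float]],
    callback: Callable[[List[float], float], int],
) -> List[float]:
    """Hypothetical stand-in for a TTS engine.

    Passes each chunk and the progress so far to the callback and
    stops as soon as the callback returns 0.
    """
    out: List[float] = []
    for i, chunk in enumerate(chunks, start=1):
        out.extend(chunk)
        progress = i / len(chunks)
        if callback(chunk, progress) == 0:
            break
    return out


def stop_after_half(samples: List[float], progress: float) -> int:
    # Return 1 to continue generating, 0 to stop.
    return 1 if progress < 0.5 else 0


# Four chunks of four samples each; generation stops after the second chunk.
audio = fake_generate([[0.0] * 4] * 4, stop_after_half)
print(len(audio))
```

A callback of this shape is typically used to play audio while it is being generated, or to cancel a long synthesis request early.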
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
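If the compiled example is launched from a script, the same library search path can be set programmatically. The sketch below is one way to do it with the Python standard library; `library_path_variable` and `env_with_library_dir` are hypothetical helper names, and only Linux and macOS are handled because those are the two platforms mentioned above.

```python
import os
import sys
from typing import Dict, Optional


def library_path_variable(platform: str = sys.platform) -> str:
    """Return the name of the shared-library search path variable."""
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unhandled platform: {platform}")


def env_with_library_dir(
    lib_dir: str,
    env: Optional[Dict[str, str]] = None,
    platform: str = sys.platform,
) -> Dict[str, str]:
    """Return a copy of env with lib_dir prepended to the search path."""
    new_env = dict(os.environ if env is None else env)
    var = library_path_variable(platform)
    old = new_env.get(var, "")
    new_env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return new_env


example = env_with_library_dir(
    "/tmp/sherpa-onnx/shared/lib", env={}, platform="linux"
)
print(example)
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting /tmp/test-supertonic.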
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "ar"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "ar"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'ar'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
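The real-time factor (RTF) printed by the Node.js example above is the processing time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` is a hypothetical helper, and the numbers in the usage line are made up for illustration rather than measured.

```python
def real_time_factor(
    elapsed_seconds: float, num_samples: int, sample_rate: int
) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 1.2 s spent generating 115200 samples at 24000 Hz (4.8 s).
rtf = real_time_factor(1.2, 115200, 24000)
print(f"RTF = {rtf:.3f}")
```

With a real engine, `elapsed_seconds` would be measured around the generate call, for example with `time.perf_counter()`.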
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'ar'},
);
final audio = tts.generateWithConfig(text: 'هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "ar"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ar\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "ar"),
)
val audio = tts.generateWithConfigAndCallback(
text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"ar\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "ar"}';
Audio := Tts.GenerateWithConfig('هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "ar"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 1
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 2
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 3
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 4
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 5
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 6
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 7
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 8
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Speaker 9
0
مرحبا بالعالم.
1
كيف حالك اليوم؟
2
السماء زرقاء والهواء لطيف.
3
يساعد التعلم الآلي الحواسيب على فهم البيانات.
4
تحول تقنية تحويل النص إلى كلام الجمل إلى صوت واضح.
5
قرأ الطلاب قصة قصيرة في المكتبة صباحا.
6
أعلن القطار عن تأخير بسيط بسبب أعمال الصيانة.
7
تعمل النماذج الصغيرة بسرعة على الأجهزة المحلية.
8
يساعد المساعد الصوتي المستخدمين في المهام اليومية.
9
تحتاج الأنظمة الحديثة إلى قراءة مستقرة للنصوص الطويلة.
Albanian
This section lists text to speech models for Albanian.
vits-piper-sq_AL-edon-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sq/sq_AL/edon/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sq_AL-edon-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
config.model.vits.tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sq_AL-edon-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
data_dir="vits-piper-sq_AL-edon-medium/espeak-ng-data",
tokens="vits-piper-sq_AL-edon-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Çdo fillim është i vështirë, por çdo fund është i bukur.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
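The configuration above points at three paths inside the extracted model directory. A small check like the one below can report which of them are missing before the model is loaded. This is only a sketch: the helper missing_model_files is hypothetical, and the file list is taken from the paths used in the example above.

```python
from pathlib import Path


def missing_model_files(model_dir, required):
    """Return the entries of `required` that do not exist under `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]


# Paths referenced by the config above (espeak-ng-data is a directory).
REQUIRED = ["sq_AL-edon-medium.onnx", "tokens.txt", "espeak-ng-data"]

if __name__ == "__main__":
    missing = missing_model_files("vits-piper-sq_AL-edon-medium", REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```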
C++ API
You can use the following code to play with vits-piper-sq_AL-edon-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
config.model.vits.tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx".into()),
tokens: Some("vits-piper-sq_AL-edon-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sq_AL-edon-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sq_AL-edon-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx',
tokens: 'vits-piper-sq_AL-edon-medium/tokens.txt',
dataDir: 'vits-piper-sq_AL-edon-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Çdo fillim është i vështirë, por çdo fund është i bukur.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
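The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. The same arithmetic as a minimal Python sketch, with hypothetical input numbers:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a positive number of samples and sample rate")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s spent generating 44100 samples at 22050 Hz (2 s of audio)
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```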
Dart API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx',
tokens: 'vits-piper-sq_AL-edon-medium/tokens.txt',
dataDir: 'vits-piper-sq_AL-edon-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Çdo fillim është i vështirë, por çdo fund është i bukur.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
lexicon: "",
tokens: "vits-piper-sq_AL-edon-medium/tokens.txt",
dataDir: "vits-piper-sq_AL-edon-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Çdo fillim është i vështirë, por çdo fund është i bukur."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sq_AL-edon-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sq_AL-edon-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sq_AL-edon-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
tokens = "vits-piper-sq_AL-edon-medium/tokens.txt",
dataDir = "vits-piper-sq_AL-edon-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Çdo fillim është i vështirë, por çdo fund është i bukur.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
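The callback in the Kotlin example returns 1 to continue and 0 to stop. The sketch below illustrates that contract in plain Python without calling sherpa-onnx. The function generate_in_chunks is a hypothetical stand-in for the engine, and whether the real engine keeps the chunk that triggered the stop is an assumption made here for illustration only.

```python
def generate_in_chunks(chunks, callback):
    """Hypothetical stand-in for a TTS engine: hands each chunk of samples
    to `callback` and stops early when the callback returns 0."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)  # assumption: the current chunk is kept
        if callback(chunk) == 0:
            break
    return produced


def stop_after(max_samples):
    """Build a callback that returns 1 until max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


chunks = [[0.0] * 100, [0.1] * 100, [0.2] * 100]
print(len(generate_in_chunks(chunks, stop_after(150))))  # 200
```

A callback that always returns 1, as in the Kotlin example, lets generation run to the end.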
Java API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx");
vits.setTokens("vits-piper-sq_AL-edon-medium/tokens.txt");
vits.setDataDir("vits-piper-sq_AL-edon-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Çdo fillim është i vështirë, por çdo fund është i bukur.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sq_AL-edon-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sq_AL-edon-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Çdo fillim është i vështirë, por çdo fund është i bukur.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sq_AL-edon-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sq_AL-edon-medium/sq_AL-edon-medium.onnx",
Tokens: "vits-piper-sq_AL-edon-medium/tokens.txt",
DataDir: "vits-piper-sq_AL-edon-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Çdo fillim është i vështirë, por çdo fund është i bukur."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Çdo fillim është i vështirë, por çdo fund është i bukur.
sample audios for different speakers are listed below:
Speaker 0
Basque
This section lists text to speech models for Basque.
vits-piper-eu_ES-antton-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/eu/eu_ES/antton/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-eu_ES-antton-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
config.model.vits.tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Aberats izatea baino, izen ona hobe.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-eu_ES-antton-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
data_dir="vits-piper-eu_ES-antton-medium/espeak-ng-data",
tokens="vits-piper-eu_ES-antton-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Aberats izatea baino, izen ona hobe.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
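If the soundfile package is not available, the generated samples can also be saved using only the Python standard library. The sketch below writes mono 16-bit PCM. It assumes the samples are floats in the range [-1, 1] and clips anything outside that range; the helper name write_wave_16bit is hypothetical.

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Usage with the objects from the example above:
#   write_wave_16bit("test.wav", audio.samples, audio.sample_rate)
```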
C++ API
You can use the following code to play with vits-piper-eu_ES-antton-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
config.model.vits.tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Aberats izatea baino, izen ona hobe.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx".into()),
tokens: Some("vits-piper-eu_ES-antton-medium/tokens.txt".into()),
data_dir: Some("vits-piper-eu_ES-antton-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Aberats izatea baino, izen ona hobe.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-eu_ES-antton-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx',
tokens: 'vits-piper-eu_ES-antton-medium/tokens.txt',
dataDir: 'vits-piper-eu_ES-antton-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Aberats izatea baino, izen ona hobe.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx',
tokens: 'vits-piper-eu_ES-antton-medium/tokens.txt',
dataDir: 'vits-piper-eu_ES-antton-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Aberats izatea baino, izen ona hobe.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
lexicon: "",
tokens: "vits-piper-eu_ES-antton-medium/tokens.txt",
dataDir: "vits-piper-eu_ES-antton-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Aberats izatea baino, izen ona hobe."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-eu_ES-antton-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-eu_ES-antton-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-eu_ES-antton-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
tokens = "vits-piper-eu_ES-antton-medium/tokens.txt",
dataDir = "vits-piper-eu_ES-antton-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Aberats izatea baino, izen ona hobe.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx");
vits.setTokens("vits-piper-eu_ES-antton-medium/tokens.txt");
vits.setDataDir("vits-piper-eu_ES-antton-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-eu_ES-antton-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-eu_ES-antton-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Aberats izatea baino, izen ona hobe.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-eu_ES-antton-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-eu_ES-antton-medium/eu_ES-antton-medium.onnx",
Tokens: "vits-piper-eu_ES-antton-medium/tokens.txt",
DataDir: "vits-piper-eu_ES-antton-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Aberats izatea baino, izen ona hobe."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Aberats izatea baino, izen ona hobe.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-eu_ES-maider-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/eu/eu_ES/maider/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-eu_ES-maider-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
config.model.vits.tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Aberats izatea baino, izen ona hobe.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-eu_ES-maider-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
data_dir="vits-piper-eu_ES-maider-medium/espeak-ng-data",
tokens="vits-piper-eu_ES-maider-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Aberats izatea baino, izen ona hobe.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
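After generation it can be useful to confirm that the saved file matches the sample rate listed for this model (22050 Hz). The following is a minimal sketch that assumes the audio was saved as a 16-bit PCM WAV file; `wav_info` is a hypothetical helper name and is not part of sherpa-onnx. It uses only Python's standard `wave` module.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_frames, duration_in_seconds) of a PCM WAV file.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
    return sample_rate, num_frames, num_frames / sample_rate


if __name__ == "__main__":
    import os

    # Assumes a file named test.wav exists in the current directory.
    if os.path.isfile("test.wav"):
        sample_rate, num_frames, duration = wav_info("test.wav")
        print(f"sample rate: {sample_rate} Hz, duration: {duration:.3f} s")
```

If the reported sample rate differs from the one in the table above, the file was probably produced by a different model.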
C++ API
You can use the following code to play with vits-piper-eu_ES-maider-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
config.model.vits.tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Aberats izatea baino, izen ona hobe.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx".into()),
tokens: Some("vits-piper-eu_ES-maider-medium/tokens.txt".into()),
data_dir: Some("vits-piper-eu_ES-maider-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Aberats izatea baino, izen ona hobe.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-eu_ES-maider-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx',
tokens: 'vits-piper-eu_ES-maider-medium/tokens.txt',
dataDir: 'vits-piper-eu_ES-maider-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Aberats izatea baino, izen ona hobe.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
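The Node.js example above reports a real-time factor (RTF), computed as the elapsed synthesis time divided by the duration of the generated audio. As a minimal sketch of the same arithmetic in Python (the function name `real_time_factor` is hypothetical and not part of sherpa-onnx):

```python
def real_time_factor(num_samples, sample_rate, elapsed_seconds):
    """Return elapsed_seconds divided by the audio duration in seconds.

    Hypothetical helper mirroring the arithmetic in the Node.js example.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 2 seconds of audio at 22050 Hz generated in 0.5 seconds.
    print(real_time_factor(44100, 22050, 0.5))
```

An RTF below 1 means the audio was generated faster than it takes to play back.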
Dart API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx',
tokens: 'vits-piper-eu_ES-maider-medium/tokens.txt',
dataDir: 'vits-piper-eu_ES-maider-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Aberats izatea baino, izen ona hobe.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
lexicon: "",
tokens: "vits-piper-eu_ES-maider-medium/tokens.txt",
dataDir: "vits-piper-eu_ES-maider-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Aberats izatea baino, izen ona hobe."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-eu_ES-maider-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-eu_ES-maider-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-eu_ES-maider-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
tokens = "vits-piper-eu_ES-maider-medium/tokens.txt",
dataDir = "vits-piper-eu_ES-maider-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Aberats izatea baino, izen ona hobe.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx");
vits.setTokens("vits-piper-eu_ES-maider-medium/tokens.txt");
vits.setDataDir("vits-piper-eu_ES-maider-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Aberats izatea baino, izen ona hobe.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-eu_ES-maider-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-eu_ES-maider-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Aberats izatea baino, izen ona hobe.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-eu_ES-maider-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-eu_ES-maider-medium/eu_ES-maider-medium.onnx",
Tokens: "vits-piper-eu_ES-maider-medium/tokens.txt",
DataDir: "vits-piper-eu_ES-maider-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Aberats izatea baino, izen ona hobe."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Aberats izatea baino, izen ona hobe.
sample audios for different speakers are listed below:
Speaker 0
Bulgarian
This section lists text to speech models for Bulgarian.
supertonic-3-bg
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Bulgarian (bg).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
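The language list and speaker count above can be turned into a small pre-flight check before calling the model. This is only a sketch: `check_request` is a hypothetical helper, not part of sherpa-onnx, and the constants are copied from the description on this page.

```python
# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, "
    "id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi"
).split(", ")

# Number of speakers from the table above; valid sid values are 0..9.
NUM_SPEAKERS = 10


def check_request(lang, sid):
    """Raise ValueError if lang or sid is outside what this page lists.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")


if __name__ == "__main__":
    check_request("bg", 0)  # Bulgarian, first speaker: passes silently
    print("request looks valid")
```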
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "bg"
audio = tts.generate("Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
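The configuration above refers to seven files inside the model directory. Before constructing the config, it may help to check that all of them are present. The sketch below assumes the archive was extracted into the current directory; `missing_model_files` is a hypothetical helper and not part of sherpa-onnx.

```python
from pathlib import Path
from typing import List

# Directory and file names as used in the example above.
MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir: str = MODEL_DIR) -> List[str]:
    """Return the required file names that do not exist in model_dir.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```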
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"bg\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "bg"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "bg"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'bg'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'bg'},
);
final audio = tts.generateWithConfig(text: 'Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "bg"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"bg\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
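The C# example above passes Extra as a hand-escaped JSON string. As a minimal sketch in Python (the helper name `build_extra` is hypothetical and not part of sherpa-onnx), the same string can be produced with the standard `json` module, which avoids escaping mistakes:

```python
import json


def build_extra(lang: str) -> str:
    """Serialize generation options into a JSON string for the Extra field."""
    # json.dumps handles quoting and escaping, so no manual backslashes are needed.
    return json.dumps({"lang": lang})


extra = build_extra("bg")
print(extra)
```

The printed value can then be pasted into, or passed to, whichever binding expects the JSON string form.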
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "bg"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"bg\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "bg"}';
Audio := Tts.GenerateWithConfig('Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "bg"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio files for different speakers are listed below:
Speaker 0
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 1
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 2
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 3
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 4
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 5
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 6
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 7
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 8
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Speaker 9
0
Здравей свят.
1
Как си днес?
2
Небето е синьо, а вятърът е тих.
3
Машинното обучение помага на компютрите да учат от данни.
4
Синтезът на реч превръща текст в ясен звук.
5
Учениците прочетоха кратка история в библиотеката.
6
Влакът закъсня заради поддръжка на релсите.
7
Малките модели работят бързо на локални устройства.
8
Гласовите асистенти улесняват ежедневните задачи.
9
Стабилното четене е важно за дълги и кратки изречения.
Catalan
This section lists text to speech models for Catalan.
vits-piper-ca_ES-upc_ona-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_ona/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
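The table above lists a sample rate of 22050 Hz for this model. One way to confirm that a generated file matches is to read its header with Python's standard `wave` module. This is only a sketch: `wav_info` and `matches_model` are hypothetical helper names, and it assumes the output file is uncompressed PCM, which is the only format the `wave` module reads.

```python
import wave


def wav_info(path: str) -> dict:
    """Return basic header fields of a PCM WAVE file."""
    with wave.open(path, "rb") as f:
        frames = f.getnframes()
        rate = f.getframerate()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration_seconds": frames / rate,
        }


def matches_model(path: str, expected_rate: int = 22050) -> bool:
    """True if the file's sample rate equals the rate listed for the model."""
    return wav_info(path)["sample_rate"] == expected_rate
```

Calling `matches_model("test.wav")` on a file produced by one of the examples below is a cheap check that the output was not resampled along the way.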
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leaks in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
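When the compiled binary is launched from a script rather than an interactive shell, the same library search path can be set for the child process only. The following Python sketch illustrates the idea; `with_library_path` is a hypothetical helper, and the choice between the two variable names follows the Linux/macOS split described above.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def with_library_path(lib_dir: str, platform: str = sys.platform,
                      env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    base = dict(os.environ if env is None else env)
    # macOS uses DYLD_LIBRARY_PATH; other platforms here are treated like Linux.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = base.get(var, "")
    # Avoid a trailing colon when the variable was previously unset.
    base[var] = f"{lib_dir}:{old}" if old else lib_dir
    return base


# Example use (not executed here):
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp",
#                  env=with_library_path("/tmp/sherpa-onnx/shared/lib"))
```

Passing the environment explicitly keeps the change local to the child process instead of modifying the calling shell.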
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
data_dir="vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
tokens="vits-piper-ca_ES-upc_ona-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
sid=0,
speed=1.0)
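# Write a WAV file, matching the other examples; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)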
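Before building the config, it can help to confirm that the model files referenced above are actually present. The following is a minimal sketch using only the standard library; `missing_model_files` and `REQUIRED` are hypothetical names, and the file list simply mirrors the paths used in the example above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, names: List[str]) -> List[str]:
    """Return the entries of names that do not exist under model_dir."""
    root = Path(model_dir)
    return [n for n in names if not (root / n).exists()]


# Files and directories referenced by the VITS config in the example above.
REQUIRED = ["ca_ES-upc_ona-medium.onnx", "tokens.txt", "espeak-ng-data"]

if __name__ == "__main__":
    missing = missing_model_files("vits-piper-ca_ES-upc_ona-medium", REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

A check like this gives a more specific message than a failed `config.validate()` when the archive was extracted to a different directory.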
C++ API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx".into()),
tokens: Some("vits-piper-ca_ES-upc_ona-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ca_ES-upc_ona-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Si vols estar ben servit, fes-te tu mateix el llit";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx',
tokens: 'vits-piper-ca_ES-upc_ona-medium/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Si vols estar ben servit, fes-te tu mateix el llit';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
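The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by audio duration. The same arithmetic can be sketched in Python; `real_time_factor` and `timed` are hypothetical helpers for illustration, not sherpa-onnx functions.

```python
import time
from typing import Callable, Tuple


def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn: Callable[[], object]) -> Tuple[object, float]:
    """Run fn() and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start
```

For instance, 0.5 seconds spent generating 44100 samples at 22050 Hz (2 seconds of audio) gives an RTF of 0.25.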
Dart API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx',
tokens: 'vits-piper-ca_ES-upc_ona-medium/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
lexicon: "",
tokens: "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
dataDir: "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Si vols estar ben servit, fes-te tu mateix el llit"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
tokens = "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
dataDir = "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Si vols estar ben servit, fes-te tu mateix el llit",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
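The callback in the example above always returns 1, so generation never stops early. To illustrate the return-value convention (1 to continue, 0 to stop), here is a small Python sketch of a callback that requests a stop once a sample budget is reached. `make_limited_callback` is a hypothetical helper; because this document does not show which arguments a given binding passes besides the samples, the callback accepts and ignores any extra positional arguments.

```python
from typing import Callable, Sequence


def make_limited_callback(max_samples: int) -> Callable[..., int]:
    """Build a callback that asks generation to stop after max_samples samples."""
    seen = 0

    def callback(samples: Sequence[float], *_unused) -> int:
        nonlocal seen
        seen += len(samples)
        # 1 means continue, 0 means stop (the convention used in the examples above).
        return 1 if seen < max_samples else 0

    return callback
```

A callback like this is useful for previews, where only the first second or so of audio is needed.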
Java API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx");
vits.setTokens("vits-piper-ca_ES-upc_ona-medium/tokens.txt");
vits.setDataDir("vits-piper-ca_ES-upc_ona-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_ona-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_ona-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ca_ES-upc_ona-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ca_ES-upc_ona-medium/ca_ES-upc_ona-medium.onnx",
Tokens: "vits-piper-ca_ES-upc_ona-medium/tokens.txt",
DataDir: "vits-piper-ca_ES-upc_ona-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Si vols estar ben servit, fes-te tu mateix el llit"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Si vols estar ben servit, fes-te tu mateix el llit
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ca_ES-upc_ona-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_ona/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leaks in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
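If the compiled binary is launched from a script rather than an interactive shell, the same search-path adjustment can be made programmatically. The following is a minimal sketch in Python; `library_env` is a hypothetical helper name, not part of sherpa-onnx, and it only assumes the install prefix used in the build commands of this section.

```python
import os
import sys


def library_env(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-linker search path (LD_LIBRARY_PATH on Linux,
    DYLD_LIBRARY_PATH on macOS)."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Keep any existing entries after the new directory.
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    env = library_env("/tmp/sherpa-onnx/shared/lib")
    # To launch the binary with this environment (requires the build above):
    #   import subprocess
    #   subprocess.run(["/tmp/test-piper"], env=env, check=True)
    print({k: v for k, v in env.items() if k.endswith("LIBRARY_PATH")})
```

Passing the returned dictionary as `env=` to `subprocess.run` affects only the child process, so the calling shell's environment is left untouched.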
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
            data_dir="vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
            tokens="vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
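When the configuration check above fails, the most common cause is a wrong path. A small pre-flight check, sketched below, reports which files are missing before the config is built. `missing_model_files` is a hypothetical helper, not a sherpa-onnx function; the file names are the ones used in the example above.

```python
import os


def missing_model_files(model_dir, required=("tokens.txt", "espeak-ng-data")):
    """Return the entries from `required` that do not exist under model_dir."""
    return [
        name for name in required
        if not os.path.exists(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    model_dir = "vits-piper-ca_ES-upc_ona-x_low"
    required = ("ca_ES-upc_ona-x_low.onnx", "tokens.txt", "espeak-ng-data")
    missing = missing_model_files(model_dir, required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```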
C++ API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
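The callback contract used above (return 1 to continue, return 0 to stop) can be illustrated without a model. The sketch below is in Python and uses `fake_generate`, a hypothetical stand-in for a TTS engine that is not part of sherpa-onnx; it only mimics the chunk-by-chunk calling pattern so that the effect of the return value is visible.

```python
def fake_generate(chunks, callback):
    """Hypothetical stand-in for a TTS engine: hands audio to `callback`
    chunk by chunk and stops early when the callback returns 0."""
    produced = []
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        progress = i / len(chunks)
        if callback(chunk, progress) == 0:
            break
    return produced


def stop_after_half(samples, progress):
    print(f"Progress: {progress * 100:.1f}%")
    # 1 -> continue generating, 0 -> stop generating
    return 1 if progress < 0.5 else 0


if __name__ == "__main__":
    chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4, [0.3] * 4]
    audio = fake_generate(chunks, stop_after_half)
    print("Samples produced before stopping:", len(audio))
```

A callback that always returns 1 lets generation run to completion, which is what `ProgressCallback` in the C++ example does.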
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx".into()),
tokens: Some("vits-piper-ca_ES-upc_ona-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Si vols estar ben servit, fes-te tu mateix el llit";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx',
tokens: 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Si vols estar ben servit, fes-te tu mateix el llit';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
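The real-time factor (RTF) printed by the example above is the elapsed processing time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic follows; the function names are illustrative and not part of any sherpa-onnx API.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.
    A value below 1 means generation is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # 32000 samples at 16000 Hz is 2 seconds of audio.
    print(real_time_factor(0.5, 32000, 16000))
```

For a 16000 Hz model, generating 2 seconds of audio in 0.5 seconds gives an RTF of 0.25.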
Dart API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx',
tokens: 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
lexicon: "",
tokens: "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
dataDir: "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Si vols estar ben servit, fes-te tu mateix el llit"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
tokens = "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
dataDir = "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Si vols estar ben servit, fes-te tu mateix el llit",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx");
vits.setTokens("vits-piper-ca_ES-upc_ona-x_low/tokens.txt");
vits.setDataDir("vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_ona-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ca_ES-upc_ona-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ca_ES-upc_ona-x_low/ca_ES-upc_ona-x_low.onnx",
Tokens: "vits-piper-ca_ES-upc_ona-x_low/tokens.txt",
DataDir: "vits-piper-ca_ES-upc_ona-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Si vols estar ben servit, fes-te tu mateix el llit"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Si vols estar ben servit, fes-te tu mateix el llit
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ca_ES-upc_pau-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ca/ca_ES/upc_pau/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
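With a single speaker, the only speaker ID used in the examples for this model is 0. The sketch below shows a range check that an application might apply before passing a user-supplied ID. It assumes speaker IDs are 0-based, which matches the `sid = 0` and "Speaker 0" usage in this section; `valid_sid` is a hypothetical helper, not a sherpa-onnx function.

```python
def valid_sid(sid, num_speakers):
    """Assumption: speaker IDs run from 0 to num_speakers - 1."""
    return 0 <= sid < num_speakers


if __name__ == "__main__":
    num_speakers = 1  # from the table above
    for sid in (0, 1):
        print(sid, "ok" if valid_sid(sid, num_speakers) else "out of range")
```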
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Si vols estar ben servit, fes-te tu mateix el llit";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
            data_dir="vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
            tokens="vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Si vols estar ben servit, fes-te tu mateix el llit",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
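If the soundfile package is not available, the generated samples can also be written with Python's standard-library `wave` module. The sketch below assumes the samples are mono floating-point values in the range [-1, 1] and converts them to 16-bit PCM; `wav_bytes` and `save_wav` are hypothetical helper names, not part of sherpa-onnx.

```python
import io
import struct
import wave


def wav_bytes(samples, sample_rate):
    """Encode mono float samples (assumed in [-1, 1]) as a 16-bit PCM WAV."""
    pcm = b"".join(
        # Clip first so out-of-range values cannot overflow int16.
        struct.pack("<h", int(max(-1.0, min(1.0, float(s))) * 32767))
        for s in samples
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm)
    return buf.getvalue()


def save_wav(filename, samples, sample_rate):
    """Write the encoded WAV data to a file."""
    with open(filename, "wb") as f:
        f.write(wav_bytes(samples, sample_rate))
```

With the objects from the example above, a call would look like `save_wav("test.wav", audio.samples, audio.sample_rate)`.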
C++ API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
config.model.vits.tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Si vols estar ben servit, fes-te tu mateix el llit";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx".into()),
tokens: Some("vits-piper-ca_ES-upc_pau-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Si vols estar ben servit, fes-te tu mateix el llit";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx',
tokens: 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Si vols estar ben servit, fes-te tu mateix el llit';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx',
tokens: 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt',
dataDir: 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Si vols estar ben servit, fes-te tu mateix el llit', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
lexicon: "",
tokens: "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
dataDir: "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Si vols estar ben servit, fes-te tu mateix el llit"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
tokens = "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
dataDir = "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Si vols estar ben servit, fes-te tu mateix el llit",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx");
vits.setTokens("vits-piper-ca_ES-upc_pau-x_low/tokens.txt");
vits.setDataDir("vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Si vols estar ben servit, fes-te tu mateix el llit";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ca_ES-upc_pau-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Si vols estar ben servit, fes-te tu mateix el llit', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ca_ES-upc_pau-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ca_ES-upc_pau-x_low/ca_ES-upc_pau-x_low.onnx",
Tokens: "vits-piper-ca_ES-upc_pau-x_low/tokens.txt",
DataDir: "vits-piper-ca_ES-upc_pau-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Si vols estar ben servit, fes-te tu mateix el llit"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Si vols estar ben servit, fes-te tu mateix el llit
sample audios for different speakers are listed below:
Speaker 0
Chinese
This section lists text to speech models for Chinese.
vits-piper-zh_CN-chaowen-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/zh/zh_CN/chaowen/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
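A quick way to confirm that a generated file matches the sample rate in the table above is to read its header. The following is a minimal sketch using only the Python standard library; it assumes the file is an uncompressed PCM WAV file, and the helper name wave_info is introduced here for illustration only.

```python
import os
import tempfile
import wave


def wave_info(path):
    """Return (num_channels, sample_rate, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return f.getnchannels(), rate, f.getnframes() / rate


if __name__ == "__main__":
    # Stand-in for a generated file: one second of 16-bit mono silence
    # at 22050 Hz. Replace `path` with your own test.wav to inspect it.
    path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(22050)
        f.writeframes(b"\x00\x00" * 22050)
    print(wave_info(path))
```

For this model, the second element of the returned tuple is expected to be 22050.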
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
config.model.vits.lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
config.model.vits.tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
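If you launch the compiled binary from a script rather than an interactive shell, the same search-path setup can be sketched in Python. This is only an illustration: the helper name library_path_env is hypothetical, and it assumes the install prefix /tmp/sherpa-onnx/shared used in the build steps above.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, environ=None):
    """Return (variable_name, value) that prepends lib_dir to the loader path.

    macOS uses DYLD_LIBRARY_PATH; every other platform is treated like
    Linux here and uses LD_LIBRARY_PATH.
    """
    environ = os.environ if environ is None else environ
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = environ.get(name, "")
    # Unlike a plain shell export, avoid a trailing ":" when the
    # variable was previously unset.
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value


if __name__ == "__main__":
    name, value = library_path_env("/tmp/sherpa-onnx/shared/lib")
    print(f"export {name}={value}")
```

The returned pair can be merged into a copy of os.environ and passed as the env argument of subprocess.run when starting /tmp/test-piper.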
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
lexicon="vits-piper-zh_CN-chaowen-medium/lexicon.txt",
tokens="vits-piper-zh_CN-chaowen-medium/tokens.txt",
),
num_threads=1,
),
rule_fsts="vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst",
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
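The rule_fsts value in the example above is a single comma-separated string of FST paths. As a minimal sketch, the string can be built from the model directory instead of being typed by hand; the helper name build_rule_fsts is hypothetical, and the default file names are taken from the example above.

```python
def build_rule_fsts(model_dir, names=("phone.fst", "date.fst", "number.fst")):
    """Join rule FST paths into the comma-separated string used by rule_fsts.

    The order of `names` is preserved in the result.
    """
    return ",".join(f"{model_dir}/{name}" for name in names)


if __name__ == "__main__":
    print(build_rule_fsts("vits-piper-zh_CN-chaowen-medium"))
```

The result can be passed as rule_fsts=build_rule_fsts("vits-piper-zh_CN-chaowen-medium") when constructing the config.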
C++ API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
config.model.vits.lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
config.model.vits.tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";
std::string filename = "./test.wav";
std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
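The callback convention used above (return 1 to continue, return 0 to stop) can be illustrated without the library. The following Python sketch is a simulation only: simulate_generation is a hypothetical stand-in for a TTS engine and is not part of sherpa-onnx, and whether the real engine keeps the chunk on which the callback returned 0 is not specified here.

```python
def make_stop_after(max_progress):
    """Build a callback that asks generation to stop once progress
    exceeds max_progress (a fraction between 0 and 1)."""
    def callback(samples, progress):
        # 1 means continue, 0 means stop
        return 1 if progress <= max_progress else 0
    return callback


def simulate_generation(chunks, callback):
    """Hypothetical engine: hand each chunk to the callback and stop
    as soon as the callback returns 0."""
    produced = []
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        if callback(chunk, i / len(chunks)) == 0:
            break
    return produced


if __name__ == "__main__":
    chunks = [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3], [0.4, 0.4]]
    audio = simulate_generation(chunks, make_stop_after(0.5))
    print(len(audio), "samples kept")
```

A callback of this shape is useful when playback is cancelled by the user and the remaining audio is no longer needed.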
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx".into()),
tokens: Some("vits-piper-zh_CN-chaowen-medium/tokens.txt".into()),
lexicon: Some("vits-piper-zh_CN-chaowen-medium/lexicon.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
rule_fsts: Some("vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst".into()),
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx',
tokens: 'vits-piper-zh_CN-chaowen-medium/tokens.txt',
lexicon: 'vits-piper-zh_CN-chaowen-medium/lexicon.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
ruleFsts: 'vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst',
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
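The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. A minimal Python sketch of the same arithmetic, with a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed_seconds divided by the audio duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 2 seconds of audio at 22050 Hz
    # generated in 0.5 seconds.
    print(real_time_factor(0.5, 44100, 22050))
```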
Dart API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx',
lexicon: 'vits-piper-zh_CN-chaowen-medium/lexicon.txt',
tokens: 'vits-piper-zh_CN-chaowen-medium/tokens.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
ruleFsts: 'vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst',
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Swift API.
func run() {
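let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
lexicon: "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
tokens: "vits-piper-zh_CN-chaowen-medium/tokens.txt",
dataDir: ""
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(
model: modelConfig,
ruleFsts: "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst"
)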
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt";
config.Model.Vits.Lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
tokens = "vits-piper-zh_CN-chaowen-medium/tokens.txt",
lexicon = "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
),
numThreads = 1,
debug = true,
),
ruleFsts = "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst",
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx");
vits.setTokens("vits-piper-zh_CN-chaowen-medium/tokens.txt");
vits.setLexicon("vits-piper-zh_CN-chaowen-medium/lexicon.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setRuleFsts("vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst");
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-zh_CN-chaowen-medium/tokens.txt';
Config.Model.Vits.Lexicon := 'vits-piper-zh_CN-chaowen-medium/lexicon.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.RuleFsts := 'vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst';
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-zh_CN-chaowen-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
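config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-zh_CN-chaowen-medium/zh_CN-chaowen-medium.onnx",
Lexicon: "vits-piper-zh_CN-chaowen-medium/lexicon.txt",
Tokens: "vits-piper-zh_CN-chaowen-medium/tokens.txt",
},
NumThreads: 1,
Debug: true,
},
RuleFsts: "vits-piper-zh_CN-chaowen-medium/phone.fst,vits-piper-zh_CN-chaowen-medium/date.fst,vits-piper-zh_CN-chaowen-medium/number.fst",
MaxNumSentences: 1,
}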
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-zh_CN-xiao_ya-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/zh/zh_CN/xiao_ya/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
config.model.vits.lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
config.model.vits.tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
lexicon="vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
tokens="vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
),
num_threads=1,
),
rule_fsts="vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst",
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
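When config.validate() fails in the example above, a common cause is a wrong or incomplete model directory. The following sketch lists which of the files referenced by the example are missing; the helper name missing_files is hypothetical, and the file list is copied from the configuration above.

```python
import os
import tempfile


def missing_files(model_dir, names):
    """Return the entries of `names` that do not exist as files in model_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(model_dir, n))]


if __name__ == "__main__":
    required = [
        "zh_CN-xiao_ya-medium.onnx",
        "lexicon.txt",
        "tokens.txt",
        "phone.fst",
        "date.fst",
        "number.fst",
    ]
    missing = missing_files("vits-piper-zh_CN-xiao_ya-medium", required)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

Running the check before constructing the config gives a more specific error message than a failed validation.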
C++ API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
config.model.vits.lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
config.model.vits.tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";
std::string filename = "./test.wav";
std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx".into()),
tokens: Some("vits-piper-zh_CN-xiao_ya-medium/tokens.txt".into()),
lexicon: Some("vits-piper-zh_CN-xiao_ya-medium/lexicon.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
rule_fsts: Some("vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst".into()),
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx',
tokens: 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt',
lexicon: 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
ruleFsts: 'vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst',
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx',
lexicon: 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt',
tokens: 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
ruleFsts: 'vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst',
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
lexicon: "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
tokens: "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
dataDir: ""
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt";
config.Model.Vits.Lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
tokens = "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
lexicon = "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
),
numThreads = 1,
debug = true,
),
ruleFsts = "vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst",
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
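The callback in the example above always returns 1, so generation runs to completion. As a minimal sketch of the other branch, the Python snippet below builds a callback that returns 0 once a sample budget has been collected. It models only the return-value convention stated in the comments (1 to continue, 0 to stop). The name `make_budget_callback` and the `sink` list are hypothetical, and wiring the callback into a particular language binding is not shown.

```python
from typing import Callable, List, Sequence


def make_budget_callback(max_samples: int, sink: List[float]) -> Callable[[Sequence[float]], int]:
    """Build a callback that keeps samples until max_samples have been collected."""

    def callback(samples: Sequence[float]) -> int:
        sink.extend(samples)
        # 1 means to continue, 0 means to stop
        return 1 if len(sink) < max_samples else 0

    return callback


if __name__ == "__main__":
    collected: List[float] = []
    cb = make_budget_callback(max_samples=5, sink=collected)
    print(cb([0.0, 0.1, 0.2]))  # 3 samples so far, still under the budget
    print(cb([0.3, 0.4, 0.5]))  # 6 samples collected, budget reached
```

Collecting into a caller-owned list keeps the callback free of global state, which makes it easy to reuse across several generation calls.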
Java API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx");
vits.setTokens("vits-piper-zh_CN-xiao_ya-medium/tokens.txt");
vits.setLexicon("vits-piper-zh_CN-xiao_ya-medium/lexicon.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setRuleFsts("vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst");
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-zh_CN-xiao_ya-medium/tokens.txt';
Config.Model.Vits.Lexicon := 'vits-piper-zh_CN-xiao_ya-medium/lexicon.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.RuleFsts := 'vits-piper-zh_CN-xiao_ya-medium/phone.fst,vits-piper-zh_CN-xiao_ya-medium/date.fst,vits-piper-zh_CN-xiao_ya-medium/number.fst';
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-zh_CN-xiao_ya-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-zh_CN-xiao_ya-medium/zh_CN-xiao_ya-medium.onnx",
Tokens: "vits-piper-zh_CN-xiao_ya-medium/tokens.txt",
Lexicon: "vits-piper-zh_CN-xiao_ya-medium/lexicon.txt",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.
sample audios for different speakers are listed below:
Speaker 0
matcha-icefall-zh-baker
| Info about this model | Download the model | HF Space | Android APK | Python API |
| C API | C++ API | Rust API | Node.js API | Dart API |
| Swift API | C# API | Kotlin API | Java API | Pascal API |
| Go API | Samples |
Info about this model
This model is trained using the code from https://github.com/k2-fsa/icefall/tree/master/egs/baker_zh/TTS/matcha
It supports only Chinese.
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
You need to download the acoustic model and the vocoder model.
Download the acoustic model
Please use the following code to download the model:
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2
tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2
You should see the following output:
ls -lh matcha-icefall-zh-baker/
total 150848
-rw-r--r--@ 1 fangjun staff 58K 6 Oct 08:39 date.fst
drwxr-xr-x@ 10 fangjun staff 320B 18 Feb 2025 dict
-rw-r--r--@ 1 fangjun staff 1.3M 6 Oct 08:39 lexicon.txt
-rw-r--r--@ 1 fangjun staff 72M 6 Oct 08:39 model-steps-3.onnx
-rw-r--r--@ 1 fangjun staff 63K 6 Oct 08:39 number.fst
-rw-r--r--@ 1 fangjun staff 87K 6 Oct 08:39 phone.fst
-rw-r--r--@ 1 fangjun staff 370B 6 Oct 08:39 README.md
-rw-r--r--@ 1 fangjun staff 19K 6 Oct 08:39 tokens.txt
Note: The dict directory is no longer needed for this model.
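Before building a TTS config, it can help to confirm that the extracted directory contains the files the API examples for this model refer to. The following is a minimal Python sketch under that assumption. The file names come from the listing above and from the examples in this section, while the helper names `missing_files` and `rule_fsts_argument` are illustrative and not part of sherpa-onnx.

```python
from pathlib import Path
from typing import List

# File names taken from the directory listing and the examples in this section.
REQUIRED_FILES = ["model-steps-3.onnx", "lexicon.txt", "tokens.txt"]
RULE_FST_FILES = ["phone.fst", "date.fst", "number.fst"]


def missing_files(model_dir: str) -> List[str]:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    expected = REQUIRED_FILES + RULE_FST_FILES
    return [name for name in expected if not (root / name).is_file()]


def rule_fsts_argument(model_dir: str) -> str:
    """Join the rule FST paths with commas, in the order used by the examples."""
    return ",".join(f"{model_dir}/{name}" for name in RULE_FST_FILES)


if __name__ == "__main__":
    model_dir = "matcha-icefall-zh-baker"
    print("Missing:", missing_files(model_dir))
    print(rule_fsts_argument(model_dir))
```

The string returned by `rule_fsts_argument` has the same comma-separated form that the examples pass as the rule FST setting.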
Download the vocoder model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx
You should see the following output:
ls -lh vocos-22khz-univ.onnx
-rw-r--r--@ 1 fangjun staff 51M 17 Mar 2025 vocos-22khz-univ.onnx
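The vocoder has to match the sample rate of the acoustic model: this model runs at 22050 Hz and is paired with vocos-22khz-univ.onnx, while the 16000 Hz matcha-icefall-zh-en model in this document is paired with vocos-16khz-univ.onnx. A minimal Python sketch of that lookup follows. The table covers only these two pairings, and the names `vocoder_for` and `vocoder_url` are illustrative.

```python
# Pairings observed in this document only; other sample rates are not covered.
VOCODER_BY_SAMPLE_RATE = {
    16000: "vocos-16khz-univ.onnx",
    22050: "vocos-22khz-univ.onnx",
}

# Release prefix taken from the wget command above.
VOCODER_BASE_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/"
)


def vocoder_for(sample_rate: int) -> str:
    """Return the vocoder file name paired with the given sample rate."""
    try:
        return VOCODER_BY_SAMPLE_RATE[sample_rate]
    except KeyError:
        raise ValueError(f"No vocoder listed for {sample_rate} Hz") from None


def vocoder_url(sample_rate: int) -> str:
    """Return the download URL for the vocoder paired with the sample rate."""
    return VOCODER_BASE_URL + vocoder_for(sample_rate)


if __name__ == "__main__":
    print(vocoder_url(22050))
```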
Huggingface space
You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech
Huggingface space (WebAssembly, wasm)
You can try this model by visiting
https://huggingface.co/spaces/k2-fsa/web-assembly-zh-tts-matcha
The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
The following code shows how to use the Python API of sherpa-onnx with this model.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
acoustic_model="matcha-icefall-zh-baker/model-steps-3.onnx",
vocoder="vocos-22khz-univ.onnx",
lexicon="matcha-icefall-zh-baker/lexicon.txt",
tokens="matcha-icefall-zh-baker/tokens.txt",
),
num_threads=2,
debug=True, # set it False to disable debug output
),
max_num_sentences=1,
rule_fsts="matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
audio = tts.generate(text, sid=0, speed=1.0)
sf.write(
"./test.mp3",
audio.samples,
samplerate=audio.sample_rate,
)
You can save it as test-zh.py and then run:
pip install sherpa-onnx soundfile
python3 ./test-zh.py
You will get a file test.mp3 in the end.
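The script above writes test.mp3 through soundfile, which relies on MP3 support in the underlying libsndfile build. If that is not available, the samples can be written as a 16-bit PCM WAV file with the Python standard library instead. The sketch below assumes the samples are mono floats in the range [-1, 1]; the function name `write_wav_pcm16` is illustrative.

```python
import math
import struct
import wave
from typing import Sequence


def write_wav_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # A 0.2-second 440 Hz tone stands in for generated audio.
    rate = 22050
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 5)]
    write_wav_pcm16("tone.wav", tone, rate)
```

With the script above, the call would be `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.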
C API
You can use the following code to play with matcha-icefall-zh-baker using C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-zh \
/tmp/test-zh.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# This assumes the acoustic model and the vocoder model have been downloaded to /tmp
./test-zh
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with matcha-icefall-zh-baker using C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
std::string filename = "./test.wav";
std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-zh \
/tmp/test-zh.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# This assumes the acoustic model and the vocoder model have been downloaded to /tmp
./test-zh
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with matcha-icefall-zh-baker with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
matcha: OfflineTtsMatchaModelConfig {
acoustic_model: Some("matcha-icefall-zh-baker/model-steps-3.onnx".into()),
vocoder: Some("vocos-22khz-univ.onnx".into()),
tokens: Some("matcha-icefall-zh-baker/tokens.txt".into()),
lexicon: Some("matcha-icefall-zh-baker/lexicon.txt".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
rule_fsts: Some("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst".into()),
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with matcha-icefall-zh-baker with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
matcha: {
acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
vocoder: 'vocos-22khz-univ.onnx',
tokens: 'matcha-icefall-zh-baker/tokens.txt',
lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
ruleFsts: 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst',
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with matcha-icefall-zh-baker with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
vocoder: 'vocos-22khz-univ.onnx',
tokens: 'matcha-icefall-zh-baker/tokens.txt',
lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
matcha: matcha,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with matcha-icefall-zh-baker with Swift API.
func run() {
let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
acousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
vocoder: "vocos-22khz-univ.onnx",
tokens: "matcha-icefall-zh-baker/tokens.txt",
dataDir: "",
lexicon: "matcha-icefall-zh-baker/lexicon.txt"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with matcha-icefall-zh-baker with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-22khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-zh-baker/tokens.txt";
config.Model.Matcha.Lexicon = "matcha-icefall-zh-baker/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with matcha-icefall-zh-baker with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
matcha = OfflineTtsMatchaModelConfig(
acousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx",
vocoder = "vocos-22khz-univ.onnx",
tokens = "matcha-icefall-zh-baker/tokens.txt",
lexicon = "matcha-icefall-zh-baker/lexicon.txt",
),
numThreads = 1,
debug = true,
),
ruleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with matcha-icefall-zh-baker with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var matcha = new OfflineTtsMatchaModelConfig();
matcha.setAcousticModel("matcha-icefall-zh-baker/model-steps-3.onnx");
matcha.setVocoder("vocos-22khz-univ.onnx");
matcha.setTokens("matcha-icefall-zh-baker/tokens.txt");
matcha.setLexicon("matcha-icefall-zh-baker/lexicon.txt");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setMatcha(matcha);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setRuleFsts("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst");
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with matcha-icefall-zh-baker with Pascal API.
program test_matcha;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Matcha.AcousticModel := 'matcha-icefall-zh-baker/model-steps-3.onnx';
Config.Model.Matcha.Vocoder := 'vocos-22khz-univ.onnx';
Config.Model.Matcha.Tokens := 'matcha-icefall-zh-baker/tokens.txt';
Config.Model.Matcha.Lexicon := 'matcha-icefall-zh-baker/lexicon.txt';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.RuleFsts := 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst';
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with matcha-icefall-zh-baker with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Matcha: sherpa.OfflineTtsMatchaModelConfig{
AcousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
Vocoder: "vocos-22khz-univ.onnx",
Tokens: "matcha-icefall-zh-baker/tokens.txt",
Lexicon: "matcha-icefall-zh-baker/lexicon.txt",
},
NumThreads: 1,
Debug: true,
},
RuleFsts: "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.
sample audios for different speakers are listed below:
Speaker 0
Croatian
This section lists text to speech models for Croatian.
supertonic-3-hr
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Croatian (hr).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
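With 10 speakers and 31 language codes, a request can be checked before it reaches the model. The following is a minimal Python sketch of such a check, using only the speaker count and the language list stated above. The function name `check_request` is illustrative and is not part of sherpa-onnx.

```python
# Language codes and speaker count as listed in this section.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)
NUM_SPEAKERS = 10


def check_request(sid: int, lang: str) -> None:
    """Raise ValueError if the speaker ID or language code is out of range."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in 0..{NUM_SPEAKERS - 1}, got {sid}")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")


if __name__ == "__main__":
    check_request(sid=0, lang="hr")  # Croatian with the first speaker: accepted
    print(len(SUPPORTED_LANGS), "languages,", NUM_SPEAKERS, "speakers")
```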
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 using the Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hr"
audio = tts.generate("Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
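If soundfile is not installed, the standard-library wave module can write the result instead. The sketch below assumes the samples are mono floats in the range [-1, 1], which is an assumption about audio.samples rather than something stated on this page. `write_wave_int16` is a hypothetical helper name, and silence stands in for generated audio.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 cast cannot overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in data: 0.5 s of silence at this model's 24000 Hz sample rate.
write_wave_int16("silence.wav", [0.0] * 12000, 24000)
```

With real output, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).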
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"hr\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
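When the binary is launched from a script rather than an interactive shell, the same search-path setup can be computed programmatically. The following is a sketch only: `library_path_env` is a made-up helper that picks the variable name by platform and prepends the library directory to any existing value.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, env=None):
    """Return (variable_name, new_value) with lib_dir prepended."""
    env = os.environ if env is None else env
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(name, "")
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value


name, value = library_path_env("/tmp/sherpa-onnx/shared/lib")
print(f"export {name}={value}")
```

The returned pair can be merged into a copy of os.environ and passed as the env argument of subprocess.run.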
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "hr"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "hr"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'hr'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
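The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the sample count divided by the sample rate. The same arithmetic is sketched in Python below with made-up numbers. `real_time_factor` is an illustrative name, not a library function.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (audio duration in seconds, elapsed / duration)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return duration, elapsed_seconds / duration


# Made-up numbers: 48000 samples at 24000 Hz, generated in 0.5 s.
duration, rtf = real_time_factor(0.5, 48000, 24000)
print(f"duration={duration:.3f}s rtf={rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play back.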
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'hr'},
);
final audio = tts.generateWithConfig(text: 'Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "hr"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using System;
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hr\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "hr"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"hr\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "hr"}';
Audio := Tts.GenerateWithConfig('Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "hr"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 1
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 2
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 3
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 4
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 5
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 6
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 7
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 8
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
Speaker 9
0
Pozdrav svijete.
1
Kako si danas?
2
Nebo je plavo, a vjetar je blag.
3
Strojno učenje pomaže računalima učiti iz podataka.
4
Sinteza govora pretvara tekst u jasan zvuk.
5
Učenici su u knjižnici pročitali kratku priču.
6
Vlak je kasnio zbog održavanja pruge.
7
Mali modeli brzo rade na lokalnim uređajima.
8
Glasovni asistent pomaže u svakodnevnim zadacima.
9
Stabilno čitanje važno je za kratke i duge rečenice.
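The samples above form a grid of 10 speakers by 10 sentences. As a rough sketch of how such a grid could be enumerated before handing each pair to one of the API examples, the snippet below yields one job per speaker ID and sentence. Only the first three sentences are included, and the output file naming scheme is hypothetical.

```python
from itertools import product

NUM_SPEAKERS = 10  # from the table at the top of this section

# The first three sample sentences from the list above.
SENTENCES = [
    "Pozdrav svijete.",
    "Kako si danas?",
    "Nebo je plavo, a vjetar je blag.",
]


def sample_jobs(num_speakers, sentences):
    """Yield (sid, sentence index, text, filename) for every combination."""
    for sid, (idx, text) in product(range(num_speakers), enumerate(sentences)):
        # The file naming scheme is made up for illustration.
        yield sid, idx, text, f"hr-sid{sid}-{idx}.wav"


jobs = list(sample_jobs(NUM_SPEAKERS, SENTENCES))
print(len(jobs), jobs[0][3], jobs[-1][3])
```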
Czech
This section lists text to speech models for Czech.
vits-piper-cs_CZ-jirka-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cs/cs_CZ/jirka/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don't know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
config.model.vits.tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-cs_CZ-jirka-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
data_dir="vits-piper-cs_CZ-jirka-low/espeak-ng-data",
tokens="vits-piper-cs_CZ-jirka-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Co můžeš udělat dnes, neodkládej na zítřek. ",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
config.model.vits.tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
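If the example binary is launched from a script rather than an interactive shell, the same library search path can be set programmatically. The following is a minimal Python sketch, assuming the install prefix /tmp/sherpa-onnx/shared used above. The helper names `library_path_var` and `env_with_library_dir` are hypothetical and are not part of sherpa-onnx.

```python
import os
import platform


def library_path_var(system: str) -> str:
    """Return the dynamic-loader search-path variable for the given OS name."""
    # platform.system() returns "Linux" on Linux and "Darwin" on macOS.
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unsupported system: {system}")


def env_with_library_dir(env: dict, lib_dir: str, system: str) -> dict:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    var = library_path_var(system)
    new_env = dict(env)
    old = new_env.get(var, "")
    new_env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return new_env


if __name__ == "__main__":
    system = platform.system()
    if system in ("Linux", "Darwin"):
        # Assumed install location, taken from the build steps above.
        env = env_with_library_dir(
            dict(os.environ), "/tmp/sherpa-onnx/shared/lib", system
        )
        var = library_path_var(system)
        print(f"{var}={env[var]}")
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when starting /tmp/test-piper.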
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx".into()),
tokens: Some("vits-piper-cs_CZ-jirka-low/tokens.txt".into()),
data_dir: Some("vits-piper-cs_CZ-jirka-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-cs_CZ-jirka-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx',
tokens: 'vits-piper-cs_CZ-jirka-low/tokens.txt',
dataDir: 'vits-piper-cs_CZ-jirka-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Co můžeš udělat dnes, neodkládej na zítřek. ';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
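The Node.js example above prints a real-time factor (RTF), that is, the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. The function name `real_time_factor` is hypothetical and the numbers in the demo are made up for illustration, not measured results.

```python
def real_time_factor(num_samples: int, sample_rate: int, elapsed_seconds: float) -> float:
    """RTF = processing time / audio duration.

    A value below 1.0 means the audio was generated faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Made-up numbers: 3 seconds of audio at 22050 Hz generated in 0.6 seconds.
    rtf = real_time_factor(num_samples=66150, sample_rate=22050, elapsed_seconds=0.6)
    print(f"RTF = {rtf:.3f}")
```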
Dart API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx',
tokens: 'vits-piper-cs_CZ-jirka-low/tokens.txt',
dataDir: 'vits-piper-cs_CZ-jirka-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Co můžeš udělat dnes, neodkládej na zítřek. ', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
lexicon: "",
tokens: "vits-piper-cs_CZ-jirka-low/tokens.txt",
dataDir: "vits-piper-cs_CZ-jirka-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Co můžeš udělat dnes, neodkládej na zítřek. "
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx";
config.Model.Vits.Tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
tokens = "vits-piper-cs_CZ-jirka-low/tokens.txt",
dataDir = "vits-piper-cs_CZ-jirka-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Co můžeš udělat dnes, neodkládej na zítřek. ",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
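The callbacks in the examples above follow one convention: return 1 to continue generating and 0 to stop. The following Python sketch illustrates that convention with a callback that stops once a sample budget is reached. `fake_generate` is a hypothetical stand-in that only simulates chunked delivery. It is not the sherpa-onnx API, and the real engine may deliver audio in different chunk sizes.

```python
from typing import Callable, List, Sequence, Tuple


def make_budget_callback(max_samples: int) -> Tuple[Callable, List[float]]:
    """Build a callback that asks the generator to stop after max_samples."""
    received: List[float] = []

    def callback(samples: Sequence[float], progress: float) -> int:
        received.extend(samples)
        # 1 means continue, 0 means stop (same convention as the examples above).
        return 1 if len(received) < max_samples else 0

    return callback, received


def fake_generate(chunks: Sequence[Sequence[float]], callback: Callable) -> int:
    """Hypothetical driver: feed chunks to the callback until it returns 0.

    Returns the number of chunks that were delivered.
    """
    delivered = 0
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered += 1
        if callback(chunk, i / total) == 0:
            break
    return delivered


if __name__ == "__main__":
    cb, got = make_budget_callback(max_samples=5)
    n = fake_generate([[0.0] * 3 for _ in range(4)], cb)
    print(f"delivered {n} chunks, {len(got)} samples")
```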
Java API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx");
vits.setTokens("vits-piper-cs_CZ-jirka-low/tokens.txt");
vits.setDataDir("vits-piper-cs_CZ-jirka-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-cs_CZ-jirka-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-cs_CZ-jirka-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Co můžeš udělat dnes, neodkládej na zítřek. ', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-cs_CZ-jirka-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-cs_CZ-jirka-low/cs_CZ-jirka-low.onnx",
Tokens: "vits-piper-cs_CZ-jirka-low/tokens.txt",
DataDir: "vits-piper-cs_CZ-jirka-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Co můžeš udělat dnes, neodkládej na zítřek. "
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Co můžeš udělat dnes, neodkládej na zítřek.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-cs_CZ-jirka-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cs/cs_CZ/jirka/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
config.model.vits.tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
data_dir="vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
tokens="vits-piper-cs_CZ-jirka-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Co můžeš udělat dnes, neodkládej na zítřek. ",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
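When `config.validate()` fails in the example above, a common cause is a wrong or incomplete model directory. The following sketch, using only the Python standard library, lists which of the three paths referenced by the config are missing. The helper name `missing_model_files` is hypothetical, and the file layout is assumed to match the paths used in the example.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-cs_CZ-jirka-medium", "cs_CZ-jirka-medium.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```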
C++ API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
config.model.vits.tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx".into()),
tokens: Some("vits-piper-cs_CZ-jirka-medium/tokens.txt".into()),
data_dir: Some("vits-piper-cs_CZ-jirka-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx',
tokens: 'vits-piper-cs_CZ-jirka-medium/tokens.txt',
dataDir: 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Co můžeš udělat dnes, neodkládej na zítřek. ';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx',
tokens: 'vits-piper-cs_CZ-jirka-medium/tokens.txt',
dataDir: 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Co můžeš udělat dnes, neodkládej na zítřek. ', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
lexicon: "",
tokens: "vits-piper-cs_CZ-jirka-medium/tokens.txt",
dataDir: "vits-piper-cs_CZ-jirka-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Co můžeš udělat dnes, neodkládej na zítřek. "
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
tokens = "vits-piper-cs_CZ-jirka-medium/tokens.txt",
dataDir = "vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Co můžeš udělat dnes, neodkládej na zítřek. ",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx");
vits.setTokens("vits-piper-cs_CZ-jirka-medium/tokens.txt");
vits.setDataDir("vits-piper-cs_CZ-jirka-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Co můžeš udělat dnes, neodkládej na zítřek. ";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-cs_CZ-jirka-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-cs_CZ-jirka-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Co můžeš udělat dnes, neodkládej na zítřek. ', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-cs_CZ-jirka-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-cs_CZ-jirka-medium/cs_CZ-jirka-medium.onnx",
Tokens: "vits-piper-cs_CZ-jirka-medium/tokens.txt",
DataDir: "vits-piper-cs_CZ-jirka-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Co můžeš udělat dnes, neodkládej na zítřek. "
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Co můžeš udělat dnes, neodkládej na zítřek.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-cs
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Czech (cs).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
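The language list and the speaker ID range above can be checked before a request is sent to the model. The following is a minimal sketch that hard-codes the 31 language codes and the 10 speaker IDs listed above. The names `SUPERTONIC_3_LANGS`, `NUM_SPEAKERS`, and `check_request` are hypothetical and are not part of sherpa-onnx.

```python
# Language codes copied from the list above (31 entries).
SUPERTONIC_3_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)

# Number of speakers from the table above; valid sids are 0..9.
NUM_SPEAKERS = 10


def check_request(lang: str, sid: int) -> None:
    """Raise ValueError if lang or sid is outside the documented ranges."""
    if lang not in SUPERTONIC_3_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")


if __name__ == "__main__":
    check_request("cs", 0)  # Czech, first speaker: passes silently
    print("request is valid")
```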
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "cs"
audio = tts.generate("Toto je převodník textu na řeč využívající novou generaci kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  const char *text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;
  gen_cfg.extra = "{\"lang\": \"cs\"}";

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
                                             ProgressCallback, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");
  return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
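# The source file is listed before the libraries so that linkers
# running with --as-needed can resolve its symbols.
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime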
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
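When the binary is launched from a script, the same loader setting can be applied to the child process only. The sketch below uses only the standard library; `library_path_var` and `env_with_library_dir` are hypothetical helper names, and the sketch assumes that only macOS needs the `DYLD_` variant.

```python
import os
import sys


def library_path_var(platform=sys.platform):
    """Name of the variable the dynamic loader reads on this platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(lib_dir, env=None, platform=sys.platform):
    """Return a copy of env with lib_dir prepended to the loader search path."""
    new_env = dict(os.environ if env is None else env)
    var = library_path_var(platform)
    old = new_env.get(var, "")
    # Keep any existing entries after the new directory.
    new_env[var] = lib_dir + (os.pathsep + old if old else "")
    return new_env
```

The returned mapping can be passed as the `env` argument of `subprocess.run`, for example `subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=env_with_library_dir("/tmp/sherpa-onnx/shared/lib"))`.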
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;

  config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
  config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
  config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
  config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
  config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
  config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
  config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
  config.model.num_threads = 2;

  // If you don't want to see debug messages, please set it to 0
  config.model.debug = 1;

  std::string filename = "./test.wav";
  std::string text = "Toto je převodník textu na řeč využívající novou generaci kaldi";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.num_steps = 8;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed
  gen_cfg.extra = R"({"lang": "cs"})";

  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());
  return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
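# The source file is listed before the libraries so that linkers
# running with --as-needed can resolve its symbols.
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-supertonic \
  /tmp/test-supertonic.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime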
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Toto je převodník textu na řeč využívající novou generaci kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "cs"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Toto je převodník textu na řeč využívající novou generaci kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'cs'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
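The Node.js example above reports a real-time factor (RTF): elapsed processing time divided by the duration of the generated audio. A minimal sketch of the same arithmetic in Python follows. The function name `real_time_factor` is hypothetical, and the numbers in the usage block are illustrative, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a non-empty audio buffer and a positive sample rate")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 0.5 s spent generating 2 s of audio.
    rtf = real_time_factor(0.5, 88200, 44100)
    print(f"RTF = {rtf:.3f}")
```

An RTF of 0.25 would mean that one second of audio takes a quarter of a second to synthesize.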
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'cs'},
);
final audio = tts.generateWithConfig(text: 'Toto je převodník textu na řeč využívající novou generaci kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Toto je převodník textu na řeč využívající novou generaci kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "cs"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Toto je převodník textu na řeč využívající novou generaci kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"cs\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
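The C and C# examples above pass the language as a hand-escaped JSON string in the `extra` field. A minimal sketch of producing that string programmatically is shown here, using only the Python standard library. `make_extra` and `as_c_literal` are hypothetical helper names, and the only key used is the `lang` key that appears in the examples.

```python
import json


def make_extra(lang):
    """Serialize the per-request options to the JSON text used for `extra`."""
    return json.dumps({"lang": lang})


def as_c_literal(text):
    """Quote and escape text so it can be pasted as a C or C# string literal.

    JSON string escaping is reused here; that is an assumption which holds
    for plain ASCII content such as the language code.
    """
    return json.dumps(text)


if __name__ == "__main__":
    extra = make_extra("cs")
    print(extra)
    print(as_c_literal(extra))
```

Generating the string avoids mismatched quotes or missing backslashes when more keys are added later.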
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "cs"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Toto je převodník textu na řeč využívající novou generaci kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
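The callbacks in the examples above follow one contract: return 1 to continue generating and 0 to stop. A minimal sketch of a callback that stops once a sample budget is reached follows, in plain Python with no sherpa-onnx calls. `make_limited_callback` is a hypothetical name, and the single `samples` argument mirrors the Kotlin signature above.

```python
def make_limited_callback(max_samples):
    """Build a callback that asks the generator to stop after max_samples."""
    received = 0

    def callback(samples):
        nonlocal received
        received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if received < max_samples else 0

    return callback


if __name__ == "__main__":
    cb = make_limited_callback(max_samples=5)
    print(cb([0.0, 0.0, 0.0]))  # 3 samples received so far
    print(cb([0.0, 0.0, 0.0]))  # 6 samples received so far
```

Keeping the counter in a closure means each generation request can get its own independent budget.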
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Toto je převodník textu na řeč využívající novou generaci kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"cs\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "cs"}';
Audio := Tts.GenerateWithConfig('Toto je převodník textu na řeč využívající novou generaci kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Toto je převodník textu na řeč využívající novou generaci kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "cs"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Audio samples for each speaker are listed below:
Speaker 0
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 1
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 2
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 3
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 4
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 5
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 6
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 7
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 8
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
Speaker 9
0
Ahoj světe.
1
Jak se dnes máš?
2
Obloha je modrá a vítr je mírný.
3
Strojové učení pomáhá počítačům učit se z dat.
4
Syntéza řeči převádí text na srozumitelný zvuk.
5
Studenti četli krátký příběh v knihovně.
6
Vlak měl zpoždění kvůli údržbě trati.
7
Malé modely běží rychle na místních zařízeních.
8
Hlasový asistent pomáhá s každodenními úkoly.
9
Stabilní čtení je důležité pro dlouhé i krátké věty.
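The listing above pairs each of the ten speakers (0 to 9) with the same ten sentences. A minimal sketch of enumerating such a grid is shown below. The output file naming scheme and the `sample_jobs` helper are hypothetical, and only the first two sentences are included to keep the sketch short.

```python
from itertools import product

NUM_SPEAKERS = 10  # speakers 0..9, as listed above
SENTENCES = [
    "Ahoj světe.",
    "Jak se dnes máš?",
]  # the first two of the ten sentences above


def sample_jobs(num_speakers, sentences):
    """Yield (sid, sentence index, text, output file name) for every pair."""
    for sid, (idx, text) in product(range(num_speakers), enumerate(sentences)):
        yield sid, idx, text, f"speaker-{sid}-sentence-{idx}.wav"


if __name__ == "__main__":
    for sid, idx, text, filename in sample_jobs(NUM_SPEAKERS, SENTENCES):
        print(sid, idx, filename, text)
```

Each job's `sid` would be assigned to `gen_config.sid` and its text passed to `tts.generate`, as in the Python example earlier in this section.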
Danish
This section lists text to speech models for Danish.
vits-piper-da_DK-talesyntese-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/da/da_DK/talesyntese/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
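If you can read the device's machine name (for example from `adb shell uname -m`), it can be mapped to an ABI name. The sketch below is a hypothetical helper, not part of sherpa-onnx. The table covers only common machine names and falls back to arm64-v8a, following the advice above.

```python
# uname -m style machine names mapped to Android ABI names.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit capable CPU
    "x86_64": "x86_64",
    "i686": "x86",
}


def pick_abi(machine, default="arm64-v8a"):
    """Return the APK ABI for a machine name, falling back to arm64-v8a."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    print(pick_abi("aarch64"))
```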
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C API.
#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-da_DK-talesyntese-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
            data_dir="vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
            tokens="vits-piper-da_DK-talesyntese-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
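If `soundfile` is not available, the generated samples can also be saved with Python's standard `wave` module. The sketch below assumes the samples are mono floats in the range [-1, 1] and converts them to 16-bit PCM. `write_wav_int16` is a hypothetical helper name, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as 16-bit PCM."""
    # Clip first so out-of-range values cannot overflow the int16 range.
    clipped = (max(-1.0, min(1.0, float(s))) for s in samples)
    pcm = [int(round(s * 32767)) for s in clipped]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

With the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.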
C++ API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;

  config.model.vits.model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
  config.model.vits.tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());
  return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx".into()),
tokens: Some("vits-piper-da_DK-talesyntese-medium/tokens.txt".into()),
data_dir: Some("vits-piper-da_DK-talesyntese-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx',
tokens: 'vits-piper-da_DK-talesyntese-medium/tokens.txt',
dataDir: 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
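The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python. The function name `real_time_factor` and the numbers used in the demo are hypothetical and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; values below 1 mean faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 3 s of audio at 22050 Hz generated in 0.6 s.
    rtf = real_time_factor(0.6, 3 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play back.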
Dart API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx',
tokens: 'vits-piper-da_DK-talesyntese-medium/tokens.txt',
dataDir: 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
lexicon: "",
tokens: "vits-piper-da_DK-talesyntese-medium/tokens.txt",
dataDir: "vits-piper-da_DK-talesyntese-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
tokens = "vits-piper-da_DK-talesyntese-medium/tokens.txt",
dataDir = "vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx");
vits.setTokens("vits-piper-da_DK-talesyntese-medium/tokens.txt");
vits.setDataDir("vits-piper-da_DK-talesyntese-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-da_DK-talesyntese-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-da_DK-talesyntese-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-da_DK-talesyntese-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-da_DK-talesyntese-medium/da_DK-talesyntese-medium.onnx",
Tokens: "vits-piper-da_DK-talesyntese-medium/tokens.txt",
DataDir: "vits-piper-da_DK-talesyntese-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-da
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Danish (da).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
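The examples later in this section select the language through an `extra` field, which several of the APIs take as a JSON string such as `{"lang": "da"}`. As a minimal sketch, the following Python snippet builds that string and checks the language code and speaker ID against the values listed above. The helpers `make_extra` and `check_sid` are hypothetical. The validation is this sketch's own, and how the library itself reacts to an unsupported code or an out-of-range speaker ID is not described here.

```python
import json

# Language codes listed above for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv nl "
    "pl pt ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # speaker IDs run from 0 to 9


def make_extra(lang: str) -> str:
    """Return the JSON string carrying the language code."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Return sid unchanged if it names one of the listed speakers."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


if __name__ == "__main__":
    print(make_extra("da"), check_sid(0))
```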
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with the Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "da"
audio = tts.generate("Dette er en tekst til tale-motor, der bruger næste generation af kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
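If `soundfile` is not installed, the samples can also be written with Python's standard `wave` module. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]. The helper `write_wave_pcm16` is hypothetical, and a generated tone stands in for `audio.samples` so that the snippet runs without a model.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave_pcm16(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    sample_rate = 24000
    # Stand-in for audio.samples: one second of a 440 Hz tone.
    samples = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
               for n in range(sample_rate)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tone.wav")
        write_wave_pcm16(path, samples, sample_rate)
        with wave.open(path, "rb") as f:
            print(f.getframerate(), f.getnframes())
```

With real output, the call would be `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.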
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"da\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
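The progress callback in the C example above returns 1 to continue and 0 to stop. That contract can be illustrated without the library. The following Python sketch is a stand-in, not the library's implementation: the function `generate_in_chunks` is hypothetical, and how sherpa-onnx actually splits the audio into chunks is not described here.

```python
from typing import Callable, List, Sequence


def generate_in_chunks(
    chunks: Sequence[List[float]],
    callback: Callable[[List[float], float], int],
) -> List[float]:
    """Feed chunks to the callback one by one; stop early when it returns 0."""
    out: List[float] = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        out.extend(chunk)
        progress = i / total  # fraction in (0, 1], like the `progress` argument
        if callback(chunk, progress) == 0:
            break  # 0 means "stop generating"
    return out


def stop_at_half(samples: List[float], progress: float) -> int:
    """Example callback: report progress and stop once half the work is done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 0 if progress >= 0.5 else 1


if __name__ == "__main__":
    fake_chunks = [[0.1], [0.2], [0.3], [0.4]]
    print(generate_in_chunks(fake_chunks, stop_at_half))
```

A callback that always returns 1, as in the C example, lets generation run to completion.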
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
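The run step above assumes that the model has been extracted to /tmp. A quick pre-flight check can confirm that the seven files referenced by the example are in place before the binary is started. This is a minimal Python sketch, and the helper `missing_model_files` is hypothetical. The file names are copied from the example configuration above.

```python
from pathlib import Path

MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"

# Files referenced by the supertonic configuration in the example.
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(root: str = "/tmp") -> list:
    """Return the required file names that are absent under root/MODEL_DIR."""
    base = Path(root) / MODEL_DIR
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("/tmp")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```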
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "da"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "da"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Dette er en tekst til tale-motor, der bruger næste generation af kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'da'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'da'},
);
final audio = tts.generateWithConfig(text: 'Dette er en tekst til tale-motor, der bruger næste generation af kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "da"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"da\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "da"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Dette er en tekst til tale-motor, der bruger næste generation af kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"da\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "da"}';
Audio := Tts.GenerateWithConfig('Dette er en tekst til tale-motor, der bruger næste generation af kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Dette er en tekst til tale-motor, der bruger næste generation af kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "da"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 1
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 2
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 3
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 4
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 5
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 6
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 7
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 8
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Speaker 9
0
Hej verden.
1
Hvordan har du det i dag?
2
Himlen er blå, og vinden er mild.
3
Maskinlæring hjælper computere med at lære af data.
4
Talesyntese omdanner tekst til tydelig lyd.
5
Eleverne læste en kort historie på biblioteket.
6
Toget blev forsinket på grund af sporarbejde.
7
Små modeller kører hurtigt på lokale enheder.
8
En stemmeassistent hjælper med daglige opgaver.
9
Stabil oplæsning er vigtig for både korte og lange sætninger.
Dutch
This section lists text to speech models for Dutch.
vits-piper-nl_BE-nathalie-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_BE/nathalie/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "God schiep het water, maar de Nederlander schiep de dijk";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
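The Linux and macOS commands above differ only in the name of the environment variable. The following Python sketch prints the matching export line for the current platform. The helper names are hypothetical, and it assumes the install prefix /tmp/sherpa-onnx/shared used in this section.

```python
import platform


def library_path_var(system: str) -> str:
    """Return the loader search-path variable for an OS name.

    platform.system() returns "Darwin" on macOS and "Linux" on Linux.
    Any other value falls back to the Linux name; Windows is not covered.
    """
    return "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"


def prepend_path(lib_dir: str, current: str) -> str:
    """Prepend lib_dir to an existing colon-separated search path."""
    return f"{lib_dir}:{current}" if current else lib_dir


if __name__ == "__main__":
    import os

    var = library_path_var(platform.system())
    value = prepend_path("/tmp/sherpa-onnx/shared/lib", os.environ.get(var, ""))
    print(f"export {var}={value}")
```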
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_BE-nathalie-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
data_dir="vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
tokens="vits-piper-nl_BE-nathalie-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
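If soundfile is not available, the generated samples can also be saved with the Python standard library. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; that range is an assumption about the output and the function name write_wave_int16 is hypothetical.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Values outside [-1, 1] are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, it would be called as write_wave_int16("test.wav", audio.samples, audio.sample_rate).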
C++ API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx".into()),
tokens: Some("vits-piper-nl_BE-nathalie-medium/tokens.txt".into()),
data_dir: Some("vits-piper-nl_BE-nathalie-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx',
tokens: 'vits-piper-nl_BE-nathalie-medium/tokens.txt',
dataDir: 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
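The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. An RTF below 1 means generation is faster than playback. The same arithmetic as a small Python sketch; the function name is hypothetical.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the audio duration in seconds."""
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Example: 0.5 s spent generating 2 s of audio at 22050 Hz.
    print(real_time_factor(0.5, 2 * 22050, 22050))
```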
Dart API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx',
tokens: 'vits-piper-nl_BE-nathalie-medium/tokens.txt',
dataDir: 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
lexicon: "",
tokens: "vits-piper-nl_BE-nathalie-medium/tokens.txt",
dataDir: "vits-piper-nl_BE-nathalie-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
tokens = "vits-piper-nl_BE-nathalie-medium/tokens.txt",
dataDir = "vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
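The callback in the example above returns 1 to continue and 0 to stop. To illustrate that contract without loading a model, the following Python sketch uses a toy driver in place of the TTS engine. Both run_with_callback and make_limit_callback are hypothetical names; the real engine decides when and how often the callback is invoked.

```python
def run_with_callback(chunks, callback):
    """Toy stand-in for a TTS engine.

    Passes each chunk of samples to the callback and stops early
    as soon as the callback returns 0.
    """
    delivered = []
    for chunk in chunks:
        delivered.extend(chunk)
        if callback(chunk) == 0:
            break
    return delivered


def make_limit_callback(max_samples):
    """Return a callback that asks to stop once max_samples were seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1: continue, 0: stop

    return callback


if __name__ == "__main__":
    chunks = [[0.1] * 100, [0.2] * 100, [0.3] * 100]
    out = run_with_callback(chunks, make_limit_callback(150))
    print(len(out))
```

A callback that always returns 1, as in the example above, lets generation run to completion.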
Java API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx");
vits.setTokens("vits-piper-nl_BE-nathalie-medium/tokens.txt");
vits.setDataDir("vits-piper-nl_BE-nathalie-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_BE-nathalie-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_BE-nathalie-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_BE-nathalie-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_BE-nathalie-medium/nl_BE-nathalie-medium.onnx",
Tokens: "vits-piper-nl_BE-nathalie-medium/tokens.txt",
DataDir: "vits-piper-nl_BE-nathalie-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
sample audios for different speakers are listed below:
Speaker 0
vits-piper-nl_BE-nathalie-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_BE/nathalie/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
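The sample rate fixes how many samples one second of audio contains, so this model (16000) produces fewer samples per second than vits-piper-nl_BE-nathalie-medium (22050). A small Python sketch of the resulting raw audio size, assuming mono 16-bit PCM; the sample format is an assumption and the function name is hypothetical.

```python
def pcm16_mono_bytes(sample_rate, seconds):
    """Raw size in bytes of mono 16-bit PCM audio, without a file header."""
    return int(sample_rate * seconds) * 2  # 2 bytes per sample


if __name__ == "__main__":
    for rate in (16000, 22050):
        print(rate, pcm16_mono_bytes(rate, 1))
```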
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
config.model.vits.tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "God schiep het water, maar de Nederlander schiep de dijk";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
data_dir="vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
tokens="vits-piper-nl_BE-nathalie-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
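The config above points at three paths inside the model directory: the .onnx model, tokens.txt, and espeak-ng-data. As a complement to config.validate(), the following sketch reports which of these paths are missing before the config is constructed. The helper name missing_model_files is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the required model paths that do not exist in model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,  # the acoustic model
        root / "tokens.txt",
        root / "espeak-ng-data",  # a directory
    ]
    return [str(p) for p in required if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-nl_BE-nathalie-x_low", "nl_BE-nathalie-x_low.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
```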
C++ API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
config.model.vits.tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx".into()),
tokens: Some("vits-piper-nl_BE-nathalie-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-nl_BE-nathalie-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx',
tokens: 'vits-piper-nl_BE-nathalie-x_low/tokens.txt',
dataDir: 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
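The Node.js example above prints a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. A minimal Python sketch of the same arithmetic follows; the function name is hypothetical and the numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1.0 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Made-up numbers: 2 seconds of audio at 16 kHz generated in 0.5 seconds.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=32000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```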
Dart API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx',
tokens: 'vits-piper-nl_BE-nathalie-x_low/tokens.txt',
dataDir: 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
lexicon: "",
tokens: "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
dataDir: "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
tokens = "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
dataDir = "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
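The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch simulates that convention with plain lists so that you can see its effect without loading a model. It is a stand-in and not the sherpa-onnx API: the names generate_in_chunks and stop_after are hypothetical, and keeping the chunk that triggered the stop is an assumption of this sketch.

```python
def generate_in_chunks(chunks, callback):
    """Simulate chunked generation: call callback after each chunk and
    stop early when it returns 0 (hypothetical stand-in, not the sherpa-onnx API)."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:
            break
    return produced


def stop_after(max_samples):
    """Build a callback that returns 1 until max_samples have been seen, then 0."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


chunks = [[0.0] * 100, [0.0] * 100, [0.0] * 100]
print(len(generate_in_chunks(chunks, lambda samples: 1)))  # prints 300
print(len(generate_in_chunks(chunks, stop_after(150))))    # prints 200
```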
Java API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx");
vits.setTokens("vits-piper-nl_BE-nathalie-x_low/tokens.txt");
vits.setDataDir("vits-piper-nl_BE-nathalie-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_BE-nathalie-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_BE-nathalie-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_BE-nathalie-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_BE-nathalie-x_low/nl_BE-nathalie-x_low.onnx",
Tokens: "vits-piper-nl_BE-nathalie-x_low/tokens.txt",
DataDir: "vits-piper-nl_BE-nathalie-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
sample audios for different speakers are listed below:
Speaker 0
vits-piper-nl_NL-alex-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/alex/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
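The two numbers in the table above determine which speaker IDs you can pass and how many samples make up one second of audio. Below is a small Python sketch, assuming speaker IDs are numbered from 0 (as the sid = 0 examples in this section suggest); the helper names are hypothetical.

```python
NUM_SPEAKERS = 1     # from the table above
SAMPLE_RATE = 22050  # samples per second, from the table above


def check_sid(sid, num_speakers=NUM_SPEAKERS):
    """Return sid if it lies in [0, num_speakers); raise ValueError otherwise."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers}), got {sid}")
    return sid


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Length of mono audio in seconds."""
    return num_samples / sample_rate


print(check_sid(0))             # prints 0
print(duration_seconds(44100))  # prints 2.0
```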
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_NL-alex-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "God schiep het water, maar de Nederlander schiep de dijk";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_NL-alex-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
data_dir="vits-piper-nl_NL-alex-medium/espeak-ng-data",
tokens="vits-piper-nl_NL-alex-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
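If you do not want to depend on soundfile, the samples can also be written as a WAV file with the Python standard library. The following is a minimal sketch; the function name write_wav_16bit is hypothetical, and it assumes the samples are mono floats in the range [-1, 1].

```python
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Usage with the objects from the example above:
#   write_wav_16bit("test.wav", audio.samples, audio.sample_rate)
```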
C++ API
You can use the following code to play with vits-piper-nl_NL-alex-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx".into()),
tokens: Some("vits-piper-nl_NL-alex-medium/tokens.txt".into()),
data_dir: Some("vits-piper-nl_NL-alex-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_NL-alex-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx',
tokens: 'vits-piper-nl_NL-alex-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-alex-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx',
tokens: 'vits-piper-nl_NL-alex-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-alex-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
lexicon: "",
tokens: "vits-piper-nl_NL-alex-medium/tokens.txt",
dataDir: "vits-piper-nl_NL-alex-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_NL-alex-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-alex-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-alex-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
tokens = "vits-piper-nl_NL-alex-medium/tokens.txt",
dataDir = "vits-piper-nl_NL-alex-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx");
vits.setTokens("vits-piper-nl_NL-alex-medium/tokens.txt");
vits.setDataDir("vits-piper-nl_NL-alex-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_NL-alex-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_NL-alex-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_NL-alex-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_NL-alex-medium/nl_NL-alex-medium.onnx",
Tokens: "vits-piper-nl_NL-alex-medium/tokens.txt",
DataDir: "vits-piper-nl_NL-alex-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
sample audios for different speakers are listed below:
Speaker 0
vits-piper-nl_NL-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-dii-high.tar.bz2
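You can also download and extract the archive from Python. The sketch below uses only the standard library; the helper names download, extract and model_files are hypothetical. The file layout returned by model_files is inferred from the paths used in the code examples of this section, so please check it against the extracted directory.

```python
import os
import tarfile
import urllib.request

URL = ("https://github.com/k2-fsa/sherpa-onnx/releases/download/"
       "tts-models/vits-piper-nl_NL-dii-high.tar.bz2")


def download(url, dest_dir="."):
    """Download url into dest_dir and return the local archive path."""
    path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, path)
    return path


def extract(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive into dest_dir and return its top-level names."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = sorted(
            {os.path.normpath(m.name).split(os.sep)[0] for m in tar.getmembers()}
        )
        tar.extractall(dest_dir)
    return names


def model_files(model_dir):
    """Paths used by the examples in this section, derived from the directory name."""
    voice = os.path.basename(model_dir.rstrip("/"))
    if voice.startswith("vits-piper-"):
        voice = voice[len("vits-piper-"):]
    return {
        "model": f"{model_dir}/{voice}.onnx",
        "tokens": f"{model_dir}/tokens.txt",
        "data_dir": f"{model_dir}/espeak-ng-data",
    }


# Usage (requires network access):
#   archive = download(URL)
#   extract(archive)
#   print(model_files("vits-piper-nl_NL-dii-high"))
```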
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_NL-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "God schiep het water, maar de Nederlander schiep de dijk";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-nl_NL-dii-high.tar.bz2
You can use the following code to play with vits-piper-nl_NL-dii-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
data_dir="vits-piper-nl_NL-dii-high/espeak-ng-data",
tokens="vits-piper-nl_NL-dii-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-nl_NL-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_NL-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx".into()),
tokens: Some("vits-piper-nl_NL-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-nl_NL-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_NL-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx',
tokens: 'vits-piper-nl_NL-dii-high/tokens.txt',
dataDir: 'vits-piper-nl_NL-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
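The real-time factor (RTF) printed by the example above is the elapsed time divided by the duration of the generated audio. As a hedged illustration, here is the same arithmetic in Python; the function name `real_time_factor` is hypothetical and the numbers are made up.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 2 s of audio at 22050 Hz generated in 0.5 s
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")  # RTF = 0.250
```

An RTF below 1 means the audio is generated faster than it takes to play.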
Dart API
You can use the following code to play with vits-piper-nl_NL-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx',
tokens: 'vits-piper-nl_NL-dii-high/tokens.txt',
dataDir: 'vits-piper-nl_NL-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_NL-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-nl_NL-dii-high/tokens.txt",
dataDir: "vits-piper-nl_NL-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_NL-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_NL-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
tokens = "vits-piper-nl_NL-dii-high/tokens.txt",
dataDir = "vits-piper-nl_NL-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
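The callback's return value controls generation: 1 continues and 0 stops. As a language-neutral illustration of that contract, the Python sketch below builds a callback that requests a stop once a sample budget is exceeded. The factory name `make_sample_limit_callback` and the `(samples, progress)` parameter list are assumptions made for illustration; check the callback signature your binding expects.

```python
def make_sample_limit_callback(max_samples):
    """Build a callback that returns 1 (continue) until more than
    max_samples samples have been received, then 0 (stop)."""
    received = 0

    def callback(samples, progress=0.0):
        nonlocal received
        received += len(samples)
        return 1 if received <= max_samples else 0

    return callback


if __name__ == "__main__":
    cb = make_sample_limit_callback(max_samples=5)
    print(cb([0.0, 0.0, 0.0]))  # 3 samples so far -> prints 1
    print(cb([0.0, 0.0, 0.0]))  # 6 samples so far -> prints 0
```

Keeping the counter inside a closure avoids global state, so each generation call can get its own budget.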
Java API
You can use the following code to play with vits-piper-nl_NL-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx");
vits.setTokens("vits-piper-nl_NL-dii-high/tokens.txt");
vits.setDataDir("vits-piper-nl_NL-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_NL-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_NL-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_NL-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_NL-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_NL-dii-high/nl_NL-dii-high.onnx",
Tokens: "vits-piper-nl_NL-dii-high/tokens.txt",
DataDir: "vits-piper-nl_NL-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
audio samples for each speaker are listed below:
Speaker 0
vits-piper-nl_NL-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
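The sample rate in the table can be checked against a generated file. The following is a minimal sketch using Python's standard `wave` module, assuming the output is a PCM WAV file such as the test.wav written by the examples in this section. The helper name `wav_info` is hypothetical.

```python
import os
import wave


def wav_info(path):
    """Return (num_channels, sample_rate, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, duration


if __name__ == "__main__":
    # Only inspect the file if one of the examples has already produced it.
    if os.path.exists("test.wav"):
        channels, sample_rate, duration = wav_info("test.wav")
        print(f"{channels} channel(s), {sample_rate} Hz, {duration:.3f} s")
```

For this model, the reported sample rate is expected to match the 22050 listed in the table.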
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_NL-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // A NULL pointer usually means the config is invalid, e.g., a wrong path
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
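Which variable to set depends on the operating system. As a rough sketch, a Python launcher could pick the variable and prepend the library directory instead of exporting it in the shell. The helper name `env_with_library_path` is hypothetical; only Linux and macOS are handled, matching the two cases above.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given OS."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    if system == "Linux":
        var = "LD_LIBRARY_PATH"
    elif system == "Darwin":  # macOS
        var = "DYLD_LIBRARY_PATH"
    else:
        raise ValueError(f"Unsupported system: {system}")
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


if __name__ == "__main__":
    env = env_with_library_path(
        "/tmp/sherpa-onnx/shared/lib", system="Linux", base_env={}
    )
    print(env["LD_LIBRARY_PATH"])  # /tmp/sherpa-onnx/shared/lib
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the compiled binary.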
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_NL-miro-high
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
            data_dir="vits-piper-nl_NL-miro-high/espeak-ng-data",
            tokens="vits-piper-nl_NL-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="God schiep het water, maar de Nederlander schiep de dijk",
    sid=0,
    speed=1.0,
)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-nl_NL-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_NL-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx".into()),
tokens: Some("vits-piper-nl_NL-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-nl_NL-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_NL-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx',
tokens: 'vits-piper-nl_NL-miro-high/tokens.txt',
dataDir: 'vits-piper-nl_NL-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-nl_NL-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx',
tokens: 'vits-piper-nl_NL-miro-high/tokens.txt',
dataDir: 'vits-piper-nl_NL-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_NL-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-nl_NL-miro-high/tokens.txt",
dataDir: "vits-piper-nl_NL-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_NL-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_NL-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
tokens = "vits-piper-nl_NL-miro-high/tokens.txt",
dataDir = "vits-piper-nl_NL-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-nl_NL-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx");
vits.setTokens("vits-piper-nl_NL-miro-high/tokens.txt");
vits.setDataDir("vits-piper-nl_NL-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_NL-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_NL-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_NL-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_NL-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_NL-miro-high/nl_NL-miro-high.onnx",
Tokens: "vits-piper-nl_NL-miro-high/tokens.txt",
DataDir: "vits-piper-nl_NL-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
audio samples for each speaker are listed below:
Speaker 0
vits-piper-nl_NL-pim-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/pim/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_NL-pim-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));

  config.model.vits.model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
  config.model.vits.tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // A NULL pointer usually means the config is invalid, e.g., a wrong path
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }

  const char *text = "God schiep het water, maar de Nederlander schiep de dijk";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_NL-pim-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
            data_dir="vits-piper-nl_NL-pim-medium/espeak-ng-data",
            tokens="vits-piper-nl_NL-pim-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
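If the config check in the example above fails, a wrong path is a likely cause. That is an assumption about typical usage, not a statement about the library internals. The following sketch uses only the Python standard library to report which of the three entries referenced by the example are missing. The helper name missing_model_files is hypothetical and is not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected entries that are absent from model_dir.

    Hypothetical helper, not part of sherpa-onnx. It only checks the
    three paths used by the VITS config in the example above.
    """
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the ONNX model file
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-nl_NL-pim-medium", "nl_NL-pim-medium.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running such a check from the directory that contains the extracted model before creating the config can save a round of debugging.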
C++ API
You can use the following code to play with vits-piper-nl_NL-pim-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx".into()),
tokens: Some("vits-piper-nl_NL-pim-medium/tokens.txt".into()),
data_dir: Some("vits-piper-nl_NL-pim-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_NL-pim-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx',
tokens: 'vits-piper-nl_NL-pim-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-pim-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
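The script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python. The function name real_time_factor and the numbers in the usage lines are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 3 seconds of 22050 Hz audio generated in 0.6 s.
    rtf = real_time_factor(0.6, 3 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```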
Dart API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx',
tokens: 'vits-piper-nl_NL-pim-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-pim-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
lexicon: "",
tokens: "vits-piper-nl_NL-pim-medium/tokens.txt",
dataDir: "vits-piper-nl_NL-pim-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_NL-pim-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-pim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-pim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
tokens = "vits-piper-nl_NL-pim-medium/tokens.txt",
dataDir = "vits-piper-nl_NL-pim-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
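The callback in the example above returns 1 to continue and 0 to stop. The sketch below illustrates that contract in plain Python, without sherpa-onnx: a callback that asks for a stop once a sample budget is reached, driven by a stand-in generator. Both make_limited_callback and fake_generate are hypothetical names used only for illustration, and how a real engine reacts to a 0 return is not modeled here beyond ending the loop.

```python
def make_limited_callback(max_samples):
    """Build a callback that requests a stop after max_samples samples."""
    state = {"seen": 0}

    def callback(samples):
        state["seen"] += len(samples)
        # 1 means continue, 0 means stop (same contract as the example above)
        return 1 if state["seen"] < max_samples else 0

    return callback


def fake_generate(chunks, callback):
    """Stand-in for a TTS engine: deliver chunks until the callback returns 0."""
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        if callback(chunk) == 0:
            break
    return delivered


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]
    out = fake_generate(chunks, make_limited_callback(250))
    print("Chunks delivered:", len(out))
```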
Java API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx");
vits.setTokens("vits-piper-nl_NL-pim-medium/tokens.txt");
vits.setDataDir("vits-piper-nl_NL-pim-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_NL-pim-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_NL-pim-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_NL-pim-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_NL-pim-medium/nl_NL-pim-medium.onnx",
Tokens: "vits-piper-nl_NL-pim-medium/tokens.txt",
DataDir: "vits-piper-nl_NL-pim-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
sample audios for different speakers are listed below:
Speaker 0
vits-piper-nl_NL-ronnie-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/nl/nl_NL/ronnie/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "God schiep het water, maar de Nederlander schiep de dijk";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-nl_NL-ronnie-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
            data_dir="vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
            tokens="vits-piper-nl_NL-ronnie-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="God schiep het water, maar de Nederlander schiep de dijk",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
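The example above relies on the soundfile package to save the result. Where only the Python standard library is available, the samples could instead be written as 16-bit PCM WAV with the wave module. This is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]. The helper name write_wav_int16 is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Hypothetical helper; values outside [-1, 1] are clipped.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, a call such as write_wav_int16("test.wav", audio.samples, audio.sample_rate) would produce a WAV file at the model's sample rate (22050 Hz for this model).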
C++ API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
config.model.vits.tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "God schiep het water, maar de Nederlander schiep de dijk";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to set the library search path:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx".into()),
tokens: Some("vits-piper-nl_NL-ronnie-medium/tokens.txt".into()),
data_dir: Some("vits-piper-nl_NL-ronnie-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "God schiep het water, maar de Nederlander schiep de dijk";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx',
tokens: 'vits-piper-nl_NL-ronnie-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'God schiep het water, maar de Nederlander schiep de dijk';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx',
tokens: 'vits-piper-nl_NL-ronnie-medium/tokens.txt',
dataDir: 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'God schiep het water, maar de Nederlander schiep de dijk', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
lexicon: "",
tokens: "vits-piper-nl_NL-ronnie-medium/tokens.txt",
dataDir: "vits-piper-nl_NL-ronnie-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "God schiep het water, maar de Nederlander schiep de dijk"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
tokens = "vits-piper-nl_NL-ronnie-medium/tokens.txt",
dataDir = "vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "God schiep het water, maar de Nederlander schiep de dijk",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx");
vits.setTokens("vits-piper-nl_NL-ronnie-medium/tokens.txt");
vits.setDataDir("vits-piper-nl_NL-ronnie-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "God schiep het water, maar de Nederlander schiep de dijk";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-nl_NL-ronnie-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-nl_NL-ronnie-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('God schiep het water, maar de Nederlander schiep de dijk', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-nl_NL-ronnie-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-nl_NL-ronnie-medium/nl_NL-ronnie-medium.onnx",
Tokens: "vits-piper-nl_NL-ronnie-medium/tokens.txt",
DataDir: "vits-piper-nl_NL-ronnie-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "God schiep het water, maar de Nederlander schiep de dijk"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
God schiep het water, maar de Nederlander schiep de dijk
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-nl
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Dutch (nl).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
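Several examples on this page (C, C++, Rust, C#, Java, Pascal, Go) pass the language to the generator as a JSON string such as {"lang": "nl"} in the extra field. The following is a minimal sketch, not part of sherpa-onnx, of how you might validate a speaker ID and a language code before building that string. The helper names make_extra and check_sid are hypothetical; the language list and the speaker count are copied from the information above.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)

# Number of speakers copied from the table above; valid sids are 0..9.
NUM_SPEAKERS = 10


def make_extra(lang: str) -> str:
    """Hypothetical helper: build the JSON string for the `extra` field."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Hypothetical helper: reject speaker IDs outside 0..NUM_SPEAKERS-1."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


if __name__ == "__main__":
    print(make_extra("nl"))  # JSON string for Dutch
    print(check_sid(0))
```

Checking the values up front is only a convenience; how the library itself reacts to an unsupported language or speaker ID is not covered here.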
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx and soundfile (the example below imports both) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "nl"
audio = tts.generate("Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
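The Node.js example further down this page reports a real-time factor (RTF), that is, the elapsed synthesis time divided by the duration of the generated audio. The sketch below shows the same arithmetic in Python. The function name real_time_factor is hypothetical and not part of sherpa-onnx; it only assumes that you know the number of generated samples and the sample rate.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Hypothetical helper: RTF = elapsed time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Made-up numbers for illustration: 2 seconds of 24 kHz audio
    # generated in 0.5 seconds.
    rtf = real_time_factor(0.5, 48000, 24000)
    print(f"RTF = {rtf:.3f}")
```

To use it with the example above, you could record time.perf_counter() before and after the call to tts.generate and pass the difference together with len(audio.samples) and audio.sample_rate. An RTF below 1 means synthesis runs faster than real time.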
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"nl\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
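# Libraries are listed after the source file so that linkers
# that use --as-needed by default still resolve the symbols
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime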
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "nl"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
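# Libraries are listed after the source file so that linkers
# that use --as-needed by default still resolve the symbols
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime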
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "nl"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'nl'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'nl'},
);
final audio = tts.generateWithConfig(text: 'Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "nl"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"nl\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "nl"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"nl\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "nl"}';
Audio := Tts.GenerateWithConfig('Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "nl"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 1
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 2
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 3
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 4
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 5
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 6
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 7
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 8
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
Speaker 9
0
Hallo wereld.
1
Hoe gaat het vandaag?
2
De lucht is blauw en de wind is zacht.
3
Machine learning helpt computers om van gegevens te leren.
4
Spraaksynthese zet tekst om in duidelijke audio.
5
De leerlingen lazen een kort verhaal in de bibliotheek.
6
De trein had vertraging door onderhoud aan het spoor.
7
Kleine modellen draaien snel op lokale apparaten.
8
Een stemassistent helpt bij dagelijkse taken.
9
Stabiel voorlezen is belangrijk voor korte en lange zinnen.
English
This section lists text to speech models for English.
matcha-icefall-en_US-ljspeech
| Info about this model | Download the model | HF Space | Android APK | Python API |
| C API | C++ API | Rust API | Node.js API | Dart API |
| Swift API | C# API | Kotlin API | Java API | Pascal API |
| Go API | Samples |
Info about this model
This model is trained using the code from https://github.com/k2-fsa/icefall/tree/master/egs/ljspeech/TTS/matcha
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
You need to download the acoustic model and the vocoder model.
Download the acoustic model
Please use the following code to download the model:
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2
tar xvf matcha-icefall-en_US-ljspeech.tar.bz2
rm matcha-icefall-en_US-ljspeech.tar.bz2
You should see the following output:
ls -lh matcha-icefall-en_US-ljspeech/
total 144856
-rw-r--r-- 1 fangjun staff 251B Jan 2 11:05 README.md
drwxr-xr-x 122 fangjun staff 3.8K Nov 28 2023 espeak-ng-data
-rw-r--r--@ 1 fangjun staff 71M Jan 2 04:04 model-steps-3.onnx
-rw-r--r-- 1 fangjun staff 954B Jan 2 11:05 tokens.txt
Download the vocoder model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx
You should see the following output
ls -lh vocos-22khz-univ.onnx
-rw-r--r--@ 1 fangjun staff 51M 17 Mar 2025 vocos-22khz-univ.onnx
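Before running any of the API examples, it can help to confirm that the downloaded files are where the examples expect them. The following is a minimal sketch using only the Python standard library. The function name missing_files is hypothetical, and the paths are the ones referenced by the examples on this page, relative to the directory where you ran the download commands.

```python
from pathlib import Path

# Paths used by the API examples for this model, relative to the
# directory where the download commands above were run.
EXPECTED = [
    "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
    "matcha-icefall-en_US-ljspeech/tokens.txt",
    "matcha-icefall-en_US-ljspeech/espeak-ng-data",
    "vocos-22khz-univ.onnx",
]


def missing_files(root=".", expected=EXPECTED):
    """Hypothetical helper: return the expected paths that do not exist under root."""
    base = Path(root)
    return [p for p in expected if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

The check covers existence only; it does not verify file sizes or contents.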
Huggingface space
You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech
Huggingface space (WebAssembly, wasm)
You can try this model by visiting
https://huggingface.co/spaces/k2-fsa/web-assembly-en-tts-matcha
The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
The following code shows how to use the Python API of sherpa-onnx with this model.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
acoustic_model="matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
vocoder="vocos-22khz-univ.onnx",
tokens="matcha-icefall-en_US-ljspeech/tokens.txt",
data_dir="matcha-icefall-en_US-ljspeech/espeak-ng-data",
),
num_threads=2,
debug=True, # set it False to disable debug output
),
max_num_sentences=1,
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
audio = tts.generate(text, sid=0, speed=1.0)
sf.write(
"./test.mp3",
audio.samples,
samplerate=audio.sample_rate,
)
You can save it as test-en.py and then run:
pip install sherpa-onnx soundfile
python3 ./test-en.py
You will get a file test.mp3 in the end.
C API
You can use the following code to play with matcha-icefall-en_US-ljspeech using the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.matcha.acoustic_model = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
config.model.matcha.tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
config.model.matcha.data_dir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-en.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-en \
/tmp/test-en.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-en
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-en.
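Once the program has written ./test.wav, a quick sanity check is to read the file header and confirm the sample rate and duration. The following is a minimal Python sketch that uses only the standard library wave module. The helper name describe_wave is hypothetical and not part of sherpa-onnx, and the sketch assumes the output is a PCM-encoded WAV file.

```python
import wave


def describe_wave(filename):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(filename, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        # Number of frames divided by frames per second gives the duration.
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration
```

For example, describe_wave("./test.wav") returns a tuple whose first element should match the sample rate of the vocoder you used.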
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with matcha-icefall-en_US-ljspeech using the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.matcha.acoustic_model = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
config.model.matcha.tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
config.model.matcha.data_dir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-en.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-en \
/tmp/test-en.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-en
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-en.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
matcha: OfflineTtsMatchaModelConfig {
acoustic_model: Some("matcha-icefall-en_US-ljspeech/model-steps-3.onnx".into()),
vocoder: Some("vocos-22khz-univ.onnx".into()),
tokens: Some("matcha-icefall-en_US-ljspeech/tokens.txt".into()),
data_dir: Some("matcha-icefall-en_US-ljspeech/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
matcha: {
acousticModel: 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx',
vocoder: 'vocos-22khz-univ.onnx',
tokens: 'matcha-icefall-en_US-ljspeech/tokens.txt',
dataDir: 'matcha-icefall-en_US-ljspeech/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
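The Node.js example above reports a real-time factor (RTF), computed as the elapsed wall-clock time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name real_time_factor is hypothetical and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# 44100 samples at 22050 Hz is 2 seconds of audio; generating it in 0.5 s
# gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```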
Dart API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
acousticModel: 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx',
vocoder: 'vocos-22khz-univ.onnx',
tokens: 'matcha-icefall-en_US-ljspeech/tokens.txt',
dataDir: 'matcha-icefall-en_US-ljspeech/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
matcha: matcha,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Swift API.
func run() {
let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
acousticModel: "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
vocoder: "vocos-22khz-univ.onnx",
tokens: "matcha-icefall-en_US-ljspeech/tokens.txt",
dataDir: "matcha-icefall-en_US-ljspeech/espeak-ng-data",
lexicon: ""
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-22khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-en_US-ljspeech/tokens.txt";
config.Model.Matcha.DataDir = "matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
matcha = OfflineTtsMatchaModelConfig(
acousticModel = "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
vocoder = "vocos-22khz-univ.onnx",
tokens = "matcha-icefall-en_US-ljspeech/tokens.txt",
dataDir = "matcha-icefall-en_US-ljspeech/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
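The callbacks in the examples on this page follow one contract: return 1 to continue generating and 0 to stop. As a language-neutral illustration of that contract, here is a small Python sketch of a callback that stops generation once a sample budget has been reached. The class name StopAfterSamples is hypothetical and not part of sherpa-onnx. Whether a progress value is passed as a second argument depends on the API you use, so it is optional here.

```python
class StopAfterSamples:
    """Callback object that requests a stop once enough samples have arrived."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress=None):
        # Count the samples delivered in this chunk.
        self.received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if self.received < self.max_samples else 0
```

An instance such as StopAfterSamples(24000) could be used to cut generation after roughly one second of 24 kHz audio, assuming your binding accepts a callable with this shape.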
Java API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var matcha = new OfflineTtsMatchaModelConfig();
matcha.setAcousticModel("matcha-icefall-en_US-ljspeech/model-steps-3.onnx");
matcha.setVocoder("vocos-22khz-univ.onnx");
matcha.setTokens("matcha-icefall-en_US-ljspeech/tokens.txt");
matcha.setDataDir("matcha-icefall-en_US-ljspeech/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setMatcha(matcha);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Pascal API.
program test_matcha;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Matcha.AcousticModel := 'matcha-icefall-en_US-ljspeech/model-steps-3.onnx';
Config.Model.Matcha.Vocoder := 'vocos-22khz-univ.onnx';
Config.Model.Matcha.Tokens := 'matcha-icefall-en_US-ljspeech/tokens.txt';
Config.Model.Matcha.DataDir := 'matcha-icefall-en_US-ljspeech/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with matcha-icefall-en_US-ljspeech with the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Matcha: sherpa.OfflineTtsMatchaModelConfig{
AcousticModel: "matcha-icefall-en_US-ljspeech/model-steps-3.onnx",
Vocoder: "vocos-22khz-univ.onnx",
Tokens: "matcha-icefall-en_US-ljspeech/tokens.txt",
DataDir: "matcha-icefall-en_US-ljspeech/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
kitten-nano-en-v0_1-fp16
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-nano-0.1 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.1
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2
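The download and extraction steps shown for other models on this page (wget followed by tar xvf) can also be done from Python. The following is a minimal sketch that uses only the standard library. The helper names download and extract are hypothetical, and the sketch assumes the archive unpacks into a kitten-nano-en-v0_1-fp16 directory, which is the path the examples in this section use.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_1-fp16.tar.bz2"
)


def download(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir="."):
    """Extract a .tar.bz2 archive into out_dir and return out_dir as a Path."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(out_dir)
    return Path(out_dir)


# Typical use (not executed here because it needs network access):
# extract(download(MODEL_URL, "kitten-nano-en-v0_1-fp16.tar.bz2"))
```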
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
Speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
Speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
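If you prefer to select a voice by name in your own code, the two mappings above can be kept in a small lookup table. The following Python sketch copies the names from the tables. The helper names sid_of and name_of are hypothetical and not part of sherpa-onnx.

```python
# Speaker names in sid order, copied from the tables above.
SPEAKERS = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]

NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKERS)}


def sid_of(name):
    """Return the speaker ID for a speaker name."""
    if name not in NAME_TO_SID:
        raise ValueError(f"unknown speaker name: {name!r}")
    return NAME_TO_SID[name]


def name_of(sid):
    """Return the speaker name for a speaker ID."""
    if not 0 <= sid < len(SPEAKERS):
        raise ValueError(f"sid must be in [0, {len(SPEAKERS) - 1}], got {sid}")
    return SPEAKERS[sid]
```

The value returned by sid_of can then be passed as the sid argument in the examples below.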
Python API
Assume you have installed sherpa-onnx and soundfile via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with ./kitten-nano-en-v0_1-fp16
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
model="./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
voices="./kitten-nano-en-v0_1-fp16/voices.bin",
tokens="./kitten-nano-en-v0_1-fp16/tokens.txt",
data_dir="./kitten-nano-en-v0_1-fp16/espeak-ng-data",
),
num_threads=2,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
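Because this model has 8 speakers, it can be handy to render the same text once per speaker and compare the results. The following is a minimal sketch of such a loop. The function name synthesize_all_speakers and the speaker-N.wav file names are hypothetical. The sketch only assumes that tts.generate accepts text, sid, and speed as in the example above, and it takes the file writer as a parameter so that it does not depend on a particular audio library.

```python
def synthesize_all_speakers(tts, text, num_speakers, write, speed=1.0):
    """Generate text once per speaker ID and pass each result to write.

    write is called as write(filename, samples, sample_rate).
    Returns the list of file names that were passed to write.
    """
    filenames = []
    for sid in range(num_speakers):
        audio = tts.generate(text, sid=sid, speed=speed)
        filename = f"speaker-{sid}.wav"
        write(filename, audio.samples, audio.sample_rate)
        filenames.append(filename)
    return filenames


# Typical use with the tts object and soundfile import from the example above:
# synthesize_all_speakers(
#     tts, "Hello.", 8,
#     lambda name, samples, rate: sf.write(name, samples, samplerate=rate),
# )
```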
C API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-kitten /tmp/test-kitten.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-kitten /tmp/test-kitten.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-nano-en-v0_1-fp16/model.fp16.onnx".into()),
voices: Some("./kitten-nano-en-v0_1-fp16/voices.bin".into()),
tokens: Some("./kitten-nano-en-v0_1-fp16/tokens.txt".into()),
data_dir: Some("./kitten-nano-en-v0_1-fp16/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',
voices: './kitten-nano-en-v0_1-fp16/voices.bin',
tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',
dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',
voices: './kitten-nano-en-v0_1-fp16/voices.bin',
tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',
dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
voices: "./kitten-nano-en-v0_1-fp16/voices.bin",
tokens: "./kitten-nano-en-v0_1-fp16/tokens.txt",
dataDir: "./kitten-nano-en-v0_1-fp16/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_1-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
voices = "./kitten-nano-en-v0_1-fp16/voices.bin",
tokens = "./kitten-nano-en-v0_1-fp16/tokens.txt",
dataDir = "./kitten-nano-en-v0_1-fp16/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-nano-en-v0_1-fp16/model.fp16.onnx");
kitten.setVoices("./kitten-nano-en-v0_1-fp16/voices.bin");
kitten.setTokens("./kitten-nano-en-v0_1-fp16/tokens.txt");
kitten.setDataDir("./kitten-nano-en-v0_1-fp16/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-nano-en-v0_1-fp16/model.fp16.onnx';
Config.Model.Kitten.Voices := './kitten-nano-en-v0_1-fp16/voices.bin';
Config.Model.Kitten.Tokens := './kitten-nano-en-v0_1-fp16/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-nano-en-v0_1-fp16/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-nano-en-v0_1-fp16 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-nano-en-v0_1-fp16/model.fp16.onnx",
Voices: "./kitten-nano-en-v0_1-fp16/voices.bin",
Tokens: "./kitten-nano-en-v0_1-fp16/tokens.txt",
DataDir: "./kitten-nano-en-v0_1-fp16/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
kitten-nano-en-v0_2-fp16
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-nano-0.2 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.2
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_2-fp16.tar.bz2
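If you prefer to fetch the archive from a script, the following is a minimal Python sketch that downloads the address above and unpacks it. The helper name download_and_extract is hypothetical and not part of sherpa-onnx. The sketch assumes the archive unpacks into a directory named after the archive (kitten-nano-en-v0_2-fp16), which is the directory name the code examples on this page use.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_2-fp16.tar.bz2"
)

SUFFIX = ".tar.bz2"


def download_and_extract(url: str, dest_dir: str = ".") -> Path:
    """Download a .tar.bz2 archive, unpack it into dest_dir, delete the archive.

    Returns dest_dir/<archive name without .tar.bz2>. That the archive
    really unpacks into this directory is an assumption, not something
    this function verifies.
    """
    archive_name = url.rsplit("/", 1)[-1]
    if not archive_name.endswith(SUFFIX):
        raise ValueError(f"Expected a {SUFFIX} URL, got: {url}")

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    archive = dest / archive_name
    urllib.request.urlretrieve(url, archive)

    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)

    archive.unlink()  # the archive is no longer needed once unpacked
    return dest / archive_name[: -len(SUFFIX)]


# Usage (downloads into the current directory):
# model_dir = download_and_extract(MODEL_URL)
```

The archive is removed after extraction, mirroring the wget / tar / rm sequence shown for other models in this document.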
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
Speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
Speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
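The two mappings above can be kept in code as a pair of dictionaries, so that a speaker can be selected by name instead of by number. The following is a minimal sketch; the names SID_TO_NAME, NAME_TO_SID, and speaker_id are hypothetical helpers for illustration and are not part of the sherpa-onnx API.

```python
# sid -> name, copied from the table above.
SID_TO_NAME = {
    0: "expr-voice-2-m",
    1: "expr-voice-2-f",
    2: "expr-voice-3-m",
    3: "expr-voice-3-f",
    4: "expr-voice-4-m",
    5: "expr-voice-4-f",
    6: "expr-voice-5-m",
    7: "expr-voice-5-f",
}

# name -> sid, derived by inverting the dictionary above.
NAME_TO_SID = {name: sid for sid, name in SID_TO_NAME.items()}


def speaker_id(name: str) -> int:
    """Return the sid for a speaker name; raise ValueError if it is unknown."""
    try:
        return NAME_TO_SID[name]
    except KeyError:
        raise ValueError(f"Unknown speaker name: {name!r}") from None
```

The returned integer is the value you would pass as sid in the API examples on this page, for instance sid=speaker_id("expr-voice-4-f") in the Python example.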
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded and extracted the model from the address given in the Download the model section above.
You can use the following code to play with ./kitten-nano-en-v0_2-fp16:
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
            voices="./kitten-nano-en-v0_2-fp16/voices.bin",
            tokens="./kitten-nano-en-v0_2-fp16/tokens.txt",
            data_dir="./kitten-nano-en-v0_2-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-kitten /tmp/test-kitten.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-kitten /tmp/test-kitten.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-nano-en-v0_2-fp16/model.fp16.onnx".into()),
voices: Some("./kitten-nano-en-v0_2-fp16/voices.bin".into()),
tokens: Some("./kitten-nano-en-v0_2-fp16/tokens.txt".into()),
data_dir: Some("./kitten-nano-en-v0_2-fp16/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-nano-en-v0_2-fp16 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-nano-en-v0_2-fp16/model.fp16.onnx',
voices: './kitten-nano-en-v0_2-fp16/voices.bin',
tokens: './kitten-nano-en-v0_2-fp16/tokens.txt',
dataDir: './kitten-nano-en-v0_2-fp16/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
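The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. An RTF below 1 means generation is faster than playback. The same arithmetic is sketched in Python below; the function name real_time_factor is a hypothetical helper, not a sherpa-onnx API.

```python
def real_time_factor(
    elapsed_seconds: float, num_samples: int, sample_rate: int
) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")

    # Duration of the audio in seconds, e.g. 72000 samples at 24000 Hz is 3 s.
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds
```

For example, generating 72000 samples at the 24000 Hz sample rate of this model in 1.5 seconds gives 1.5 / 3.0 = 0.5.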
Dart API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-nano-en-v0_2-fp16/model.fp16.onnx',
voices: './kitten-nano-en-v0_2-fp16/voices.bin',
tokens: './kitten-nano-en-v0_2-fp16/tokens.txt',
dataDir: './kitten-nano-en-v0_2-fp16/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
voices: "./kitten-nano-en-v0_2-fp16/voices.bin",
tokens: "./kitten-nano-en-v0_2-fp16/tokens.txt",
dataDir: "./kitten-nano-en-v0_2-fp16/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_2-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
voices = "./kitten-nano-en-v0_2-fp16/voices.bin",
tokens = "./kitten-nano-en-v0_2-fp16/tokens.txt",
dataDir = "./kitten-nano-en-v0_2-fp16/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-nano-en-v0_2-fp16/model.fp16.onnx");
kitten.setVoices("./kitten-nano-en-v0_2-fp16/voices.bin");
kitten.setTokens("./kitten-nano-en-v0_2-fp16/tokens.txt");
kitten.setDataDir("./kitten-nano-en-v0_2-fp16/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-nano-en-v0_2-fp16/model.fp16.onnx';
Config.Model.Kitten.Voices := './kitten-nano-en-v0_2-fp16/voices.bin';
Config.Model.Kitten.Tokens := './kitten-nano-en-v0_2-fp16/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-nano-en-v0_2-fp16/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-nano-en-v0_2-fp16 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-nano-en-v0_2-fp16/model.fp16.onnx",
Voices: "./kitten-nano-en-v0_2-fp16/voices.bin",
Tokens: "./kitten-nano-en-v0_2-fp16/tokens.txt",
DataDir: "./kitten-nano-en-v0_2-fp16/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
kitten-mini-en-v0_1-fp16
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-mini-0.1 and it is from https://huggingface.co/KittenML/kitten-tts-mini-0.1
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_1-fp16.tar.bz2
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
Speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
Speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| sid range | Male voice | Female voice |
|---|---|---|
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
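The speaker names in the tables above follow a regular pattern: expr-voice-N-m maps to sid 2 * (N - 2) and expr-voice-N-f maps to sid 2 * (N - 2) + 1, for N from 2 to 5. The sketch below parses a name into its sid and the gender given by its suffix. The pattern is only an observation from the tables on this page, not a documented guarantee, and describe_speaker is a hypothetical helper rather than a sherpa-onnx API.

```python
import re
from typing import Tuple

# Matches names such as "expr-voice-3-f": a voice number and an m/f suffix.
_NAME_RE = re.compile(r"^expr-voice-(\d+)-([mf])$")

_SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def describe_speaker(name: str) -> Tuple[int, str]:
    """Return (sid, gender) for a speaker name listed in the tables above.

    Assumes the pattern observed in those tables; names outside the
    listed range (voice numbers 2 to 5) are rejected.
    """
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"Unrecognized speaker name: {name!r}")

    voice = int(match.group(1))
    suffix = match.group(2)
    if not 2 <= voice <= 5:
        raise ValueError(f"Voice number out of range in: {name!r}")

    # Male voices take the even sid of each pair, female voices the odd one.
    sid = 2 * (voice - 2) + (1 if suffix == "f" else 0)
    return sid, _SUFFIX_MEANING[suffix]
```

For instance, describe_speaker("expr-voice-3-f") yields sid 3 with gender "Female", matching the row for sids 2 - 3.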
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded and extracted the model from the address given in the Download the model section above.
You can use the following code to play with ./kitten-mini-en-v0_1-fp16:
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
            voices="./kitten-mini-en-v0_1-fp16/voices.bin",
            tokens="./kitten-mini-en-v0_1-fp16/tokens.txt",
            data_dir="./kitten-mini-en-v0_1-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-kitten /tmp/test-kitten.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-kitten /tmp/test-kitten.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-mini-en-v0_1-fp16/model.fp16.onnx".into()),
voices: Some("./kitten-mini-en-v0_1-fp16/voices.bin".into()),
tokens: Some("./kitten-mini-en-v0_1-fp16/tokens.txt".into()),
data_dir: Some("./kitten-mini-en-v0_1-fp16/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-mini-en-v0_1-fp16 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
voices: './kitten-mini-en-v0_1-fp16/voices.bin',
tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
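The Node.js example above reports a real-time factor (RTF): the elapsed wall-clock time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic follows. The helper name `real_time_factor` and the numbers in the usage lines are made up for illustration and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds.
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Illustrative numbers only: 48000 samples at 24000 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=48000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play it back.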
Dart API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
voices: './kitten-mini-en-v0_1-fp16/voices.bin',
tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
voices: "./kitten-mini-en-v0_1-fp16/voices.bin",
tokens: "./kitten-mini-en-v0_1-fp16/tokens.txt",
dataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
voices = "./kitten-mini-en-v0_1-fp16/voices.bin",
tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt",
dataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
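The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention with a stand-in generation loop. `fake_generate` and its chunking are hypothetical and only mimic the control flow. They are not how sherpa-onnx produces audio.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine that emits audio chunk by chunk.

    The callback is invoked after each chunk. A return value of 0 stops
    generation early; any other value lets it continue.
    """
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:
            break
    return produced


def stop_after_two_chunks() -> Callable[[List[float]], int]:
    """Build a callback that returns 1 for the first chunk and 0 for the second."""
    seen = {"count": 0}

    def callback(samples: List[float]) -> int:
        seen["count"] += 1
        return 1 if seen["count"] < 2 else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
# A callback that always returns 1 keeps every chunk.
print(len(fake_generate(chunks, lambda samples: 1)))
# This callback stops generation once the second chunk has been delivered.
print(len(fake_generate(chunks, stop_after_two_chunks())))
```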
Java API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-mini-en-v0_1-fp16/model.fp16.onnx");
kitten.setVoices("./kitten-mini-en-v0_1-fp16/voices.bin");
kitten.setTokens("./kitten-mini-en-v0_1-fp16/tokens.txt");
kitten.setDataDir("./kitten-mini-en-v0_1-fp16/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-mini-en-v0_1-fp16/model.fp16.onnx';
Config.Model.Kitten.Voices := './kitten-mini-en-v0_1-fp16/voices.bin';
Config.Model.Kitten.Tokens := './kitten-mini-en-v0_1-fp16/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-mini-en-v0_1-fp16/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
Voices: "./kitten-mini-en-v0_1-fp16/voices.bin",
Tokens: "./kitten-mini-en-v0_1-fp16/tokens.txt",
DataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
kitten-nano-en-v0_8-fp32
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-nano-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.8
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-fp32.tar.bz2
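If you would rather fetch and unpack the archive from a script, the following Python sketch uses only the standard library. The helper names `model_dir_name` and `download_and_extract` are hypothetical, and the sketch assumes the archive unpacks into a directory named after the file without the .tar.bz2 suffix, which matches the paths used in the examples on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_8-fp32.tar.bz2"
)


def model_dir_name(url: str) -> str:
    """Derive the model directory name from the archive file name.

    Assumption: the archive extracts to a directory with the same name
    as the file, minus the .tar.bz2 suffix.
    """
    name = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"expected a {suffix} archive, got {name}")
    return name[: -len(suffix)]


def download_and_extract(url: str, dest: Path = Path(".")) -> Path:
    """Download the archive, unpack it under dest, then delete the archive."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return dest / model_dir_name(url)


print(model_dir_name(URL))
# Uncomment the next line to actually download the model (needs network access):
# download_and_extract(URL)
```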
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
Speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
Speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
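The two tables above can be expressed as a small lookup in Python. This is only a convenience sketch built from the tables on this page. `SPEAKER_NAMES`, `NAME_TO_SID`, and `speaker_gender` are hypothetical names and are not provided by sherpa-onnx.

```python
# Index in this list is the speaker ID (sid), taken from the tables above.
SPEAKER_NAMES = [
    "expr-voice-2-m",
    "expr-voice-2-f",
    "expr-voice-3-m",
    "expr-voice-3-f",
    "expr-voice-4-m",
    "expr-voice-4-f",
    "expr-voice-5-m",
    "expr-voice-5-f",
]

# Reverse mapping: speaker name -> sid.
NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def speaker_gender(name: str) -> str:
    """Decode the trailing suffix: f means Female, m means Male."""
    suffix = name.rsplit("-", 1)[-1]
    return {"f": "Female", "m": "Male"}[suffix]


for sid, name in enumerate(SPEAKER_NAMES):
    print(sid, name, speaker_gender(name))
```

The resulting sid can then be passed wherever the examples on this page set `sid`.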
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-fp32.tar.bz2
You can use the following code to play with kitten-nano-en-v0_8-fp32
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
            voices="./kitten-nano-en-v0_8-fp32/voices.bin",
            tokens="./kitten-nano-en-v0_8-fp32/tokens.txt",
            data_dir="./kitten-nano-en-v0_8-fp32/espeak-ng-data",
        ),
        num_threads=2,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
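If you prefer not to depend on soundfile, the samples can also be written as a 16-bit PCM WAV file with Python's standard `wave` module. The sketch below assumes the samples are floats in the range [-1, 1]. `write_wav_int16` is a hypothetical helper, and the sine tone only stands in for `audio.samples`.

```python
import math
import struct
import wave


def write_wav_int16(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples as 16-bit PCM WAV (hypothetical helper).

    Assumption: samples are floats in [-1, 1]; values outside are clipped.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples: 0.1 seconds of a 440 Hz tone at 24000 Hz.
sample_rate = 24000
tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2400)]
write_wav_int16("tone.wav", tone, sample_rate)
```

With the real model you would pass `audio.samples` and `audio.sample_rate` instead of the tone.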
C API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kitten \
/tmp/test-kitten.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
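When the binary is launched from a script rather than a shell, the same library search path can be passed through the process environment. The Python sketch below picks the variable name per platform, as described above. `library_path_variable` and `env_with_library_path` are hypothetical helpers, and only Linux and macOS are handled.

```python
import os
import sys


def library_path_variable(platform: str = sys.platform) -> str:
    """Return the dynamic-library search path variable for the platform."""
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unsupported platform: {platform}")


def env_with_library_path(lib_dir: str, platform: str = sys.platform,
                          base_env=None) -> dict:
    """Copy the environment and prepend lib_dir to the search path variable."""
    env = dict(os.environ if base_env is None else base_env)
    name = library_path_variable(platform)
    existing = env.get(name, "")
    env[name] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


# Show the variable that would be set on Linux, starting from an empty environment.
demo = env_with_library_path("/tmp/sherpa-onnx/shared/lib", platform="linux", base_env={})
print(demo)

# To run the compiled example with this environment (not executed here):
# import subprocess
# subprocess.run(["/tmp/test-kitten"], cwd="/tmp",
#                env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"))
```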
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kitten \
/tmp/test-kitten.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-nano-en-v0_8-fp32/model.fp32.onnx".into()),
voices: Some("./kitten-nano-en-v0_8-fp32/voices.bin".into()),
tokens: Some("./kitten-nano-en-v0_8-fp32/tokens.txt".into()),
data_dir: Some("./kitten-nano-en-v0_8-fp32/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-nano-en-v0_8-fp32 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-nano-en-v0_8-fp32/model.fp32.onnx',
voices: './kitten-nano-en-v0_8-fp32/voices.bin',
tokens: './kitten-nano-en-v0_8-fp32/tokens.txt',
dataDir: './kitten-nano-en-v0_8-fp32/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-nano-en-v0_8-fp32/model.fp32.onnx',
voices: './kitten-nano-en-v0_8-fp32/voices.bin',
tokens: './kitten-nano-en-v0_8-fp32/tokens.txt',
dataDir: './kitten-nano-en-v0_8-fp32/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
voices: "./kitten-nano-en-v0_8-fp32/voices.bin",
tokens: "./kitten-nano-en-v0_8-fp32/tokens.txt",
dataDir: "./kitten-nano-en-v0_8-fp32/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_8-fp32/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
voices = "./kitten-nano-en-v0_8-fp32/voices.bin",
tokens = "./kitten-nano-en-v0_8-fp32/tokens.txt",
dataDir = "./kitten-nano-en-v0_8-fp32/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-nano-en-v0_8-fp32/model.fp32.onnx");
kitten.setVoices("./kitten-nano-en-v0_8-fp32/voices.bin");
kitten.setTokens("./kitten-nano-en-v0_8-fp32/tokens.txt");
kitten.setDataDir("./kitten-nano-en-v0_8-fp32/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-nano-en-v0_8-fp32/model.fp32.onnx';
Config.Model.Kitten.Voices := './kitten-nano-en-v0_8-fp32/voices.bin';
Config.Model.Kitten.Tokens := './kitten-nano-en-v0_8-fp32/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-nano-en-v0_8-fp32/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-nano-en-v0_8-fp32 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-nano-en-v0_8-fp32/model.fp32.onnx",
Voices: "./kitten-nano-en-v0_8-fp32/voices.bin",
Tokens: "./kitten-nano-en-v0_8-fp32/tokens.txt",
DataDir: "./kitten-nano-en-v0_8-fp32/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
kitten-nano-en-v0_8-int8
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-nano-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-nano-0.8
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-int8.tar.bz2
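The download and a quick sanity check can also be scripted. The following is a minimal sketch, not an official tool: the helper names `download_and_extract` and `missing_files` are hypothetical, the expected file names are taken from the API examples in this section, and it assumes the archive unpacks into a directory named after the archive.

```python
import tarfile
import urllib.request
from pathlib import Path

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-nano-en-v0_8-int8.tar.bz2"
)

# Entries referenced by the API examples in this section.
EXPECTED = ["model.int8.onnx", "voices.bin", "tokens.txt", "espeak-ng-data"]


def download_and_extract(url: str = URL, dest: str = ".") -> Path:
    """Download the archive, unpack it into dest, and delete the archive."""
    archive = Path(dest) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    # Assumption: the archive unpacks into a directory named after itself.
    return Path(dest) / archive.name[: -len(".tar.bz2")]


def missing_files(model_dir) -> list:
    """Return the expected entries that are absent from model_dir."""
    return [name for name in EXPECTED if not (Path(model_dir) / name).exists()]
```

Calling `missing_files(download_and_extract())` should return an empty list if the layout matches the assumption above.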
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
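The two mappings above can be kept in code as a pair of dictionaries. This is a small illustrative sketch; the names `SPEAKERS`, `SID_TO_NAME`, `NAME_TO_SID`, and `describe` are made up for this example and are not part of sherpa-onnx.

```python
# Speaker names in sid order, copied from the tables above.
SPEAKERS = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]

SID_TO_NAME = dict(enumerate(SPEAKERS))
NAME_TO_SID = {name: sid for sid, name in SID_TO_NAME.items()}

# Meaning of the trailing letter in a speaker name.
SUFFIX = {"f": "Female", "m": "Male"}


def describe(sid: int) -> str:
    """Format one speaker as 'sid -> name (gender)'."""
    name = SID_TO_NAME[sid]
    return f"{sid} -> {name} ({SUFFIX[name.rsplit('-', 1)[-1]]})"


print(describe(NAME_TO_SID["expr-voice-4-f"]))  # 5 -> expr-voice-4-f (Female)
```

The integer looked up this way is the value to pass as `sid` in the API examples that follow.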
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_8-int8.tar.bz2
You can use the following code to play with kitten-nano-en-v0_8-int8
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
model="./kitten-nano-en-v0_8-int8/model.int8.onnx",
voices="./kitten-nano-en-v0_8-int8/voices.bin",
tokens="./kitten-nano-en-v0_8-int8/tokens.txt",
data_dir="./kitten-nano-en-v0_8-int8/espeak-ng-data",
),
num_threads=2,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kitten-nano-en-v0_8-int8 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_8-int8/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kitten \
/tmp/test-kitten.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-nano-en-v0_8-int8 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
config.model.kitten.voices = "./kitten-nano-en-v0_8-int8/voices.bin";
config.model.kitten.tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
config.model.kitten.data_dir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kitten \
/tmp/test-kitten.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-nano-en-v0_8-int8/model.int8.onnx".into()),
voices: Some("./kitten-nano-en-v0_8-int8/voices.bin".into()),
tokens: Some("./kitten-nano-en-v0_8-int8/tokens.txt".into()),
data_dir: Some("./kitten-nano-en-v0_8-int8/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-nano-en-v0_8-int8 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-nano-en-v0_8-int8/model.int8.onnx',
voices: './kitten-nano-en-v0_8-int8/voices.bin',
tokens: './kitten-nano-en-v0_8-int8/tokens.txt',
dataDir: './kitten-nano-en-v0_8-int8/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
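The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic is shown below; the function name `real_time_factor` is made up for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# 48000 samples at 24000 Hz is 2 s of audio; generating it in 0.5 s gives 0.25.
print(real_time_factor(0.5, 48000, 24000))  # 0.25
```

An RTF below 1 means the audio is generated faster than it takes to play back.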
Dart API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-nano-en-v0_8-int8/model.int8.onnx',
voices: './kitten-nano-en-v0_8-int8/voices.bin',
tokens: './kitten-nano-en-v0_8-int8/tokens.txt',
dataDir: './kitten-nano-en-v0_8-int8/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-nano-en-v0_8-int8/model.int8.onnx",
voices: "./kitten-nano-en-v0_8-int8/voices.bin",
tokens: "./kitten-nano-en-v0_8-int8/tokens.txt",
dataDir: "./kitten-nano-en-v0_8-int8/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-nano-en-v0_8-int8 with C# API.
using System;
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-nano-en-v0_8-int8/model.int8.onnx";
config.Model.Kitten.Voices = "./kitten-nano-en-v0_8-int8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-nano-en-v0_8-int8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-nano-en-v0_8-int8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-nano-en-v0_8-int8/model.int8.onnx",
voices = "./kitten-nano-en-v0_8-int8/voices.bin",
tokens = "./kitten-nano-en-v0_8-int8/tokens.txt",
dataDir = "./kitten-nano-en-v0_8-int8/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-nano-en-v0_8-int8/model.int8.onnx");
kitten.setVoices("./kitten-nano-en-v0_8-int8/voices.bin");
kitten.setTokens("./kitten-nano-en-v0_8-int8/tokens.txt");
kitten.setDataDir("./kitten-nano-en-v0_8-int8/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-nano-en-v0_8-int8/model.int8.onnx';
Config.Model.Kitten.Voices := './kitten-nano-en-v0_8-int8/voices.bin';
Config.Model.Kitten.Tokens := './kitten-nano-en-v0_8-int8/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-nano-en-v0_8-int8/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-nano-en-v0_8-int8 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-nano-en-v0_8-int8/model.int8.onnx",
Voices: "./kitten-nano-en-v0_8-int8/voices.bin",
Tokens: "./kitten-nano-en-v0_8-int8/tokens.txt",
DataDir: "./kitten-nano-en-v0_8-int8/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
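To produce one file per speaker yourself, you can loop over the speaker IDs. The sketch below is only an illustration: it assumes `tts` is a `sherpa_onnx.OfflineTts` created as in the Python API example for this model, and the output file naming in `output_name` is hypothetical.

```python
TEXT = (
    "Friends fell out often because life was changing so fast. "
    "The easiest thing in the world was to lose touch with someone."
)

# Speaker names in sid order, as listed above.
SPEAKERS = [
    "expr-voice-2-m", "expr-voice-2-f",
    "expr-voice-3-m", "expr-voice-3-f",
    "expr-voice-4-m", "expr-voice-4-f",
    "expr-voice-5-m", "expr-voice-5-f",
]


def output_name(sid: int) -> str:
    """Hypothetical naming scheme: speaker-<sid>-<name>.wav."""
    return f"speaker-{sid}-{SPEAKERS[sid]}.wav"


def generate_all(tts, text: str = TEXT) -> None:
    """Write one wave file per speaker using an existing OfflineTts object."""
    import soundfile as sf  # imported here so the helpers above work without it

    for sid in range(len(SPEAKERS)):
        audio = tts.generate(text=text, sid=sid, speed=1.0)
        sf.write(output_name(sid), audio.samples, samplerate=audio.sample_rate)
```

After creating `tts` as in the Python API example, `generate_all(tts)` would write eight files, from `speaker-0-expr-voice-2-m.wav` to `speaker-7-expr-voice-5-f.wav`.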
kitten-micro-en-v0_8
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-micro-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-micro-0.8
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-micro-en-v0_8.tar.bz2
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-micro-en-v0_8.tar.bz2
You can use the following code to play with kitten-micro-en-v0_8
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
model="./kitten-micro-en-v0_8/model.onnx",
voices="./kitten-micro-en-v0_8/voices.bin",
tokens="./kitten-micro-en-v0_8/tokens.txt",
data_dir="./kitten-micro-en-v0_8/espeak-ng-data",
),
num_threads=2,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kitten-micro-en-v0_8 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-micro-en-v0_8/model.onnx";
config.model.kitten.voices = "./kitten-micro-en-v0_8/voices.bin";
config.model.kitten.tokens = "./kitten-micro-en-v0_8/tokens.txt";
config.model.kitten.data_dir = "./kitten-micro-en-v0_8/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kitten \
/tmp/test-kitten.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-micro-en-v0_8 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-micro-en-v0_8/model.onnx";
config.model.kitten.voices = "./kitten-micro-en-v0_8/voices.bin";
config.model.kitten.tokens = "./kitten-micro-en-v0_8/tokens.txt";
config.model.kitten.data_dir = "./kitten-micro-en-v0_8/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-kitten \
/tmp/test-kitten.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-micro-en-v0_8 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-micro-en-v0_8/model.onnx".into()),
voices: Some("./kitten-micro-en-v0_8/voices.bin".into()),
tokens: Some("./kitten-micro-en-v0_8/tokens.txt".into()),
data_dir: Some("./kitten-micro-en-v0_8/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-micro-en-v0_8 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-micro-en-v0_8/model.onnx',
voices: './kitten-micro-en-v0_8/voices.bin',
tokens: './kitten-micro-en-v0_8/tokens.txt',
dataDir: './kitten-micro-en-v0_8/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with kitten-micro-en-v0_8 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-micro-en-v0_8/model.onnx',
voices: './kitten-micro-en-v0_8/voices.bin',
tokens: './kitten-micro-en-v0_8/tokens.txt',
dataDir: './kitten-micro-en-v0_8/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-micro-en-v0_8 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-micro-en-v0_8/model.onnx",
voices: "./kitten-micro-en-v0_8/voices.bin",
tokens: "./kitten-micro-en-v0_8/tokens.txt",
dataDir: "./kitten-micro-en-v0_8/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-micro-en-v0_8 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-micro-en-v0_8/model.onnx";
config.Model.Kitten.Voices = "./kitten-micro-en-v0_8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-micro-en-v0_8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-micro-en-v0_8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-micro-en-v0_8 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-micro-en-v0_8/model.onnx",
voices = "./kitten-micro-en-v0_8/voices.bin",
tokens = "./kitten-micro-en-v0_8/tokens.txt",
dataDir = "./kitten-micro-en-v0_8/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-micro-en-v0_8 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-micro-en-v0_8/model.onnx");
kitten.setVoices("./kitten-micro-en-v0_8/voices.bin");
kitten.setTokens("./kitten-micro-en-v0_8/tokens.txt");
kitten.setDataDir("./kitten-micro-en-v0_8/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-micro-en-v0_8 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-micro-en-v0_8/model.onnx';
Config.Model.Kitten.Voices := './kitten-micro-en-v0_8/voices.bin';
Config.Model.Kitten.Tokens := './kitten-micro-en-v0_8/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-micro-en-v0_8/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-micro-en-v0_8 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-micro-en-v0_8/model.onnx",
Voices: "./kitten-micro-en-v0_8/voices.bin",
Tokens: "./kitten-micro-en-v0_8/tokens.txt",
DataDir: "./kitten-micro-en-v0_8/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
kitten-mini-en-v0_8
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-mini-0.8 and it is from https://huggingface.co/KittenML/kitten-tts-mini-0.8
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_8.tar.bz2
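If you prefer to fetch the archive from a script, the download can be sketched in Python with only the standard library. This is a minimal sketch, not an official tool: the helper names `archive_name`, `model_dir_name`, and `download_and_extract` are hypothetical, and the code assumes the archive unpacks into a directory named after the archive (which matches the `./kitten-mini-en-v0_8/...` paths used in the examples on this page).

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the "Model download address" above.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-mini-en-v0_8.tar.bz2"
)
SUFFIX = ".tar.bz2"


def archive_name(url: str) -> str:
    """Return the file name component of the download URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url: str) -> str:
    """Assumption: the archive extracts into a directory named after itself."""
    name = archive_name(url)
    if not name.endswith(SUFFIX):
        raise ValueError(f"expected a {SUFFIX} archive, got {name}")
    return name[: -len(SUFFIX)]


def download_and_extract(url: str = MODEL_URL, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = Path(dest) / archive_name(url)
    urllib.request.urlretrieve(url, archive)
    # Only unpack archives from a source you trust.
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return Path(dest) / model_dir_name(url)
```

Calling `download_and_extract()` from the directory where you run the examples should leave a `kitten-mini-en-v0_8` directory next to your script, provided the assumption about the archive layout holds.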
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
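The two mapping tables above can be captured in a small lookup helper. The following is a minimal Python sketch; `KITTEN_SPEAKERS`, `sid_to_name`, `name_to_sid`, and `speaker_gender` are hypothetical names introduced here for illustration and are not part of sherpa-onnx.

```python
# Speaker names in sid order, copied from the tables above.
KITTEN_SPEAKERS = [
    "expr-voice-2-m",
    "expr-voice-2-f",
    "expr-voice-3-m",
    "expr-voice-3-f",
    "expr-voice-4-m",
    "expr-voice-4-f",
    "expr-voice-5-m",
    "expr-voice-5-f",
]

# Meaning of the trailing suffix, from the "Meaning of speaker suffix" table.
SUFFIX_MEANING = {"f": "Female", "m": "Male"}


def sid_to_name(sid: int) -> str:
    """Map a speaker ID to its speaker name."""
    if not 0 <= sid < len(KITTEN_SPEAKERS):
        raise ValueError(f"sid must be in [0, {len(KITTEN_SPEAKERS) - 1}], got {sid}")
    return KITTEN_SPEAKERS[sid]


def name_to_sid(name: str) -> int:
    """Map a speaker name to its speaker ID."""
    if name not in KITTEN_SPEAKERS:
        raise ValueError(f"unknown speaker name: {name}")
    return KITTEN_SPEAKERS.index(name)


def speaker_gender(name: str) -> str:
    """Decode the suffix after the last hyphen (f or m)."""
    return SUFFIX_MEANING[name.rsplit("-", 1)[-1]]
```

The value returned by `name_to_sid` is what you would pass as `sid` in the API examples below.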
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_8.tar.bz2
You can use the following code to play with kitten-mini-en-v0_8
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-mini-en-v0_8/model.onnx",
            voices="./kitten-mini-en-v0_8/voices.bin",
            tokens="./kitten-mini-en-v0_8/tokens.txt",
            data_dir="./kitten-mini-en-v0_8/espeak-ng-data",
        ),
        num_threads=2,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kitten-mini-en-v0_8 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kitten.model = "./kitten-mini-en-v0_8/model.onnx";
config.model.kitten.voices = "./kitten-mini-en-v0_8/voices.bin";
config.model.kitten.tokens = "./kitten-mini-en-v0_8/tokens.txt";
config.model.kitten.data_dir = "./kitten-mini-en-v0_8/espeak-ng-data";
config.model.num_threads = 1;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leaks in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kitten \
/tmp/test-kitten.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-mini-en-v0_8 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kitten.model = "./kitten-mini-en-v0_8/model.onnx";
config.model.kitten.voices = "./kitten-mini-en-v0_8/voices.bin";
config.model.kitten.tokens = "./kitten-mini-en-v0_8/tokens.txt";
config.model.kitten.data_dir = "./kitten-mini-en-v0_8/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kitten \
/tmp/test-kitten.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-mini-en-v0_8 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kitten: OfflineTtsKittenModelConfig {
model: Some("./kitten-mini-en-v0_8/model.onnx".into()),
voices: Some("./kitten-mini-en-v0_8/voices.bin".into()),
tokens: Some("./kitten-mini-en-v0_8/tokens.txt".into()),
data_dir: Some("./kitten-mini-en-v0_8/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-mini-en-v0_8 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
const config = {
model: {
kitten: {
model: './kitten-mini-en-v0_8/model.onnx',
voices: './kitten-mini-en-v0_8/voices.bin',
tokens: './kitten-mini-en-v0_8/tokens.txt',
dataDir: './kitten-mini-en-v0_8/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
const tts = await createOfflineTtsAsync();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
console.log('Number of speakers:', tts.numSpeakers);
console.log('Sample rate:', tts.sampleRate);
const start = Date.now();
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
const audio = await tts.generateAsync({
text,
generationConfig,
onProgress({samples, progress}) {
process.stdout.write(
`\rGenerating... ${(progress * 100).toFixed(1)}%`);
return true;
},
});
console.log('\nGeneration finished.');
const stop = Date.now();
const elapsedSeconds = (stop - start) / 1000;
const durationSeconds = audio.samples.length / audio.sampleRate;
const realTimeFactor = elapsedSeconds / durationSeconds;
console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
realTimeFactor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(filename, {
samples: audio.samples,
sampleRate: audio.sampleRate,
});
console.log(`Saved to ${filename}`);
}
main().catch((err) => {
console.error('TTS failed:', err);
process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
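The Node.js example above prints a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic can be sketched in Python; `real_time_factor` is a hypothetical helper name used only for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = elapsed time / audio duration; values below 1 mean faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# 48000 samples at 24000 Hz is 2 seconds of audio.
# If generation took 0.5 seconds, the RTF is 0.5 / 2 = 0.25.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=48000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")
```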
Dart API
You can use the following code to play with kitten-mini-en-v0_8 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
model: './kitten-mini-en-v0_8/model.onnx',
voices: './kitten-mini-en-v0_8/voices.bin',
tokens: './kitten-mini-en-v0_8/tokens.txt',
dataDir: './kitten-mini-en-v0_8/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kitten: kitten,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-mini-en-v0_8 with Swift API.
func run() {
let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
model: "./kitten-mini-en-v0_8/model.onnx",
voices: "./kitten-mini-en-v0_8/voices.bin",
tokens: "./kitten-mini-en-v0_8/tokens.txt",
dataDir: "./kitten-mini-en-v0_8/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-mini-en-v0_8 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-mini-en-v0_8/model.onnx";
config.Model.Kitten.Voices = "./kitten-mini-en-v0_8/voices.bin";
config.Model.Kitten.Tokens = "./kitten-mini-en-v0_8/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-mini-en-v0_8/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-mini-en-v0_8 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kitten = OfflineTtsKittenModelConfig(
model = "./kitten-mini-en-v0_8/model.onnx",
voices = "./kitten-mini-en-v0_8/voices.bin",
tokens = "./kitten-mini-en-v0_8/tokens.txt",
dataDir = "./kitten-mini-en-v0_8/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-mini-en-v0_8 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kitten = new OfflineTtsKittenModelConfig();
kitten.setModel("./kitten-mini-en-v0_8/model.onnx");
kitten.setVoices("./kitten-mini-en-v0_8/voices.bin");
kitten.setTokens("./kitten-mini-en-v0_8/tokens.txt");
kitten.setDataDir("./kitten-mini-en-v0_8/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKitten(kitten);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-mini-en-v0_8 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kitten.Model := './kitten-mini-en-v0_8/model.onnx';
Config.Model.Kitten.Voices := './kitten-mini-en-v0_8/voices.bin';
Config.Model.Kitten.Tokens := './kitten-mini-en-v0_8/tokens.txt';
Config.Model.Kitten.DataDir := './kitten-mini-en-v0_8/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-mini-en-v0_8 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kitten: sherpa.OfflineTtsKittenModelConfig{
Model: "./kitten-mini-en-v0_8/model.onnx",
Voices: "./kitten-mini-en-v0_8/voices.bin",
Tokens: "./kitten-mini-en-v0_8/tokens.txt",
DataDir: "./kitten-mini-en-v0_8/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - expr-voice-2-m
Speaker 1 - expr-voice-2-f
Speaker 2 - expr-voice-3-m
Speaker 3 - expr-voice-3-f
Speaker 4 - expr-voice-4-m
Speaker 5 - expr-voice-4-f
Speaker 6 - expr-voice-5-m
Speaker 7 - expr-voice-5-f
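To produce one file per speaker yourself, you can loop over the speaker IDs with the Python API shown earlier for this model. The following is a sketch under stated assumptions: the `sherpa_onnx` and `soundfile` calls are the ones used in the Python example above, while `NUM_SPEAKERS` (taken from the model info table), `output_filename`, `generate_all`, and the output file naming are hypothetical choices made for illustration. The model directory is assumed to be `./kitten-mini-en-v0_8`.

```python
NUM_SPEAKERS = 8  # from the "Number of speakers" table for this model

TEXT = (
    "Friends fell out often because life was changing so fast. "
    "The easiest thing in the world was to lose touch with someone."
)


def output_filename(sid: int) -> str:
    """Hypothetical naming scheme: one wave file per speaker ID."""
    return f"kitten-mini-en-v0_8-sid-{sid}.wav"


def generate_all(model_dir: str = "./kitten-mini-en-v0_8") -> None:
    """Generate TEXT once for every speaker ID and save each result."""
    # Imported here so the helpers above work without the packages installed.
    import sherpa_onnx
    import soundfile as sf

    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
                model=f"{model_dir}/model.onnx",
                voices=f"{model_dir}/voices.bin",
                tokens=f"{model_dir}/tokens.txt",
                data_dir=f"{model_dir}/espeak-ng-data",
            ),
            num_threads=2,
        ),
    )
    if not config.validate():
        raise ValueError("Please check your config")

    tts = sherpa_onnx.OfflineTts(config)
    for sid in range(NUM_SPEAKERS):
        audio = tts.generate(text=TEXT, sid=sid, speed=1.0)
        sf.write(output_filename(sid), audio.samples, samplerate=audio.sample_rate)
```

Call `generate_all()` after downloading and extracting the model; it creates the engine once and reuses it for all eight speakers.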
kokoro-en-v0_19
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kokoro v0.19 and it is from https://huggingface.co/hexgrad/kLegacy
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 11 | 24000 |
Meaning of speaker prefix
| Prefix | Meaning | sid range | Number of speakers |
|---|---|---|---|
| af | American female | 0 - 4 | 5 |
| am | American male | 5 - 6 | 2 |
| bf | British female | 7 - 8 | 2 |
| bm | British male | 9 - 10 | 2 |
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 3 | 0 -> af | 1 -> af_bella | 2 -> af_nicole | 3 -> af_sarah |
| 4 - 7 | 4 -> af_sky | 5 -> am_adam | 6 -> am_michael | 7 -> bf_emma |
| 8 - 10 | 8 -> bf_isabella | 9 -> bm_george | 10 -> bm_lewis |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 3 | af -> 0 | af_bella -> 1 | af_nicole -> 2 | af_sarah -> 3 |
| 4 - 7 | af_sky -> 4 | am_adam -> 5 | am_michael -> 6 | bf_emma -> 7 |
| 8 - 10 | bf_isabella -> 8 | bm_george -> 9 | bm_lewis -> 10 |
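The prefix table and the two mappings above are consistent with each other, which a short Python sketch can make explicit. `KOKORO_SPEAKERS`, `PREFIX_MEANING`, `sids_with_prefix`, and `describe` are hypothetical names introduced for illustration; they are not part of sherpa-onnx.

```python
# Speaker names in sid order, copied from the mapping tables above.
KOKORO_SPEAKERS = [
    "af", "af_bella", "af_nicole", "af_sarah", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
]

# From the "Meaning of speaker prefix" table.
PREFIX_MEANING = {
    "af": "American female",
    "am": "American male",
    "bf": "British female",
    "bm": "British male",
}


def sids_with_prefix(prefix: str) -> list:
    """Return all speaker IDs whose name starts with the two-letter prefix."""
    if prefix not in PREFIX_MEANING:
        raise ValueError(f"unknown prefix: {prefix}")
    return [sid for sid, name in enumerate(KOKORO_SPEAKERS) if name[:2] == prefix]


def describe(sid: int) -> str:
    """Return a readable description such as '7: bf_emma (British female)'."""
    name = KOKORO_SPEAKERS[sid]
    return f"{sid}: {name} ({PREFIX_MEANING[name[:2]]})"
```

For example, `sids_with_prefix("bf")` gives the IDs you can pass as `sid` to get a British female voice.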
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
You can use the following code to play with kokoro-en-v0_19
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(
model="kokoro-en-v0_19/model.onnx",
voices="kokoro-en-v0_19/voices.bin",
tokens="kokoro-en-v0_19/tokens.txt",
data_dir="kokoro-en-v0_19/espeak-ng-data",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with kokoro-en-v0_19 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kokoro.model = "kokoro-en-v0_19/model.onnx";
config.model.kokoro.voices = "kokoro-en-v0_19/voices.bin";
config.model.kokoro.tokens = "kokoro-en-v0_19/tokens.txt";
config.model.kokoro.data_dir = "kokoro-en-v0_19/espeak-ng-data";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
#else
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
#endif
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kokoro \
/tmp/test-kokoro.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
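After the program has run, a quick way to sanity-check the output is to read back the header of the generated file. The Python sketch below assumes that the file is a PCM WAV that the standard-library `wave` module can open; `describe_wav` is a hypothetical helper name. For this model you would expect the reported sample rate to match the 24000 listed in the model table.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        num_samples = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "num_samples": num_samples,
            "duration_seconds": num_samples / sample_rate,
        }


# Usage:
#   print(describe_wav("/tmp/test.wav"))
```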
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kokoro-en-v0_19 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kokoro.model = "kokoro-en-v0_19/model.onnx";
config.model.kokoro.voices = "kokoro-en-v0_19/voices.bin";
config.model.kokoro.tokens = "kokoro-en-v0_19/tokens.txt";
config.model.kokoro.data_dir = "kokoro-en-v0_19/espeak-ng-data";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kokoro.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-kokoro \
/tmp/test-kokoro.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kokoro
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kokoro.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kokoro-en-v0_19 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
kokoro: OfflineTtsKokoroModelConfig {
model: Some("kokoro-en-v0_19/model.onnx".into()),
voices: Some("kokoro-en-v0_19/voices.bin".into()),
tokens: Some("kokoro-en-v0_19/tokens.txt".into()),
data_dir: Some("kokoro-en-v0_19/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kokoro-en-v0_19 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
kokoro: {
model: 'kokoro-en-v0_19/model.onnx',
voices: 'kokoro-en-v0_19/voices.bin',
tokens: 'kokoro-en-v0_19/tokens.txt',
dataDir: 'kokoro-en-v0_19/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
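The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched in Python below; `real_time_factor` is a hypothetical helper name and the numbers in the example are made up for illustration.

```python
def real_time_factor(num_samples: int, sample_rate: int, elapsed_seconds: float) -> float:
    """Return elapsed time divided by audio duration.

    A value below 1 means the audio was generated faster than it plays back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Illustrative numbers: 72000 samples at 24000 Hz (3 seconds of audio)
# generated in 0.6 seconds.
rtf = real_time_factor(num_samples=72000, sample_rate=24000, elapsed_seconds=0.6)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.200"
```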
Dart API
You can use the following code to play with kokoro-en-v0_19 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(
model: 'kokoro-en-v0_19/model.onnx',
voices: 'kokoro-en-v0_19/voices.bin',
tokens: 'kokoro-en-v0_19/tokens.txt',
dataDir: 'kokoro-en-v0_19/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
kokoro: kokoro,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kokoro-en-v0_19 with Swift API.
func run() {
let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(
model: "kokoro-en-v0_19/model.onnx",
voices: "kokoro-en-v0_19/voices.bin",
tokens: "kokoro-en-v0_19/tokens.txt",
dataDir: "kokoro-en-v0_19/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kokoro-en-v0_19 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kokoro.Model = "kokoro-en-v0_19/model.onnx";
config.Model.Kokoro.Voices = "kokoro-en-v0_19/voices.bin";
config.Model.Kokoro.Tokens = "kokoro-en-v0_19/tokens.txt";
config.Model.Kokoro.DataDir = "kokoro-en-v0_19/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kokoro-en-v0_19 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
kokoro = OfflineTtsKokoroModelConfig(
model = "kokoro-en-v0_19/model.onnx",
voices = "kokoro-en-v0_19/voices.bin",
tokens = "kokoro-en-v0_19/tokens.txt",
dataDir = "kokoro-en-v0_19/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kokoro-en-v0_19 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var kokoro = new OfflineTtsKokoroModelConfig();
kokoro.setModel("kokoro-en-v0_19/model.onnx");
kokoro.setVoices("kokoro-en-v0_19/voices.bin");
kokoro.setTokens("kokoro-en-v0_19/tokens.txt");
kokoro.setDataDir("kokoro-en-v0_19/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setKokoro(kokoro);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kokoro-en-v0_19 with Pascal API.
program test_kokoro;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Kokoro.Model := 'kokoro-en-v0_19/model.onnx';
Config.Model.Kokoro.Voices := 'kokoro-en-v0_19/voices.bin';
Config.Model.Kokoro.Tokens := 'kokoro-en-v0_19/tokens.txt';
Config.Model.Kokoro.DataDir := 'kokoro-en-v0_19/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kokoro-en-v0_19 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Kokoro: sherpa.OfflineTtsKokoroModelConfig{
Model: "kokoro-en-v0_19/model.onnx",
Voices: "kokoro-en-v0_19/voices.bin",
Tokens: "kokoro-en-v0_19/tokens.txt",
DataDir: "kokoro-en-v0_19/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 - af
Speaker 1 - af_bella
Speaker 2 - af_nicole
Speaker 3 - af_sarah
Speaker 4 - af_sky
Speaker 5 - am_adam
Speaker 6 - am_michael
Speaker 7 - bf_emma
Speaker 8 - bf_isabella
Speaker 9 - bm_george
Speaker 10 - bm_lewis
supertonic-3-en
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for English (en).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
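Because this model accepts both a speaker ID and a language code, it can be convenient to validate the two values before calling the engine. The Python sketch below encodes the 31 language codes and the speaker count listed above; `SUPERTONIC_3_LANGS`, `SUPERTONIC_3_NUM_SPEAKERS`, and `check_request` are hypothetical names and are not part of sherpa-onnx.

```python
# Language codes and speaker count copied from the model description above.
# The helper is illustrative and is NOT part of sherpa-onnx.
SUPERTONIC_3_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)
SUPERTONIC_3_NUM_SPEAKERS = 10


def check_request(sid: int, lang: str) -> None:
    """Raise ValueError if sid or lang is outside the documented range."""
    if not 0 <= sid < SUPERTONIC_3_NUM_SPEAKERS:
        raise ValueError(
            f"sid must be in [0, {SUPERTONIC_3_NUM_SPEAKERS - 1}], got {sid}"
        )
    if lang not in SUPERTONIC_3_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
```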
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model and extracted it into the directory sherpa-onnx-supertonic-3-tts-int8-2026-05-11.
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "en"
audio = tts.generate("How are you doing today? This is a text-to-speech engine using next generation Kaldi.", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"en\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
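In the C example above, the `extra` field is a hand-escaped JSON string, `{"lang": "en"}`. If you generate such strings from a script, serializing them with a JSON library avoids escaping mistakes. A minimal Python sketch follows; `build_extra` is a hypothetical helper name, and `lang` is the only key taken from this page.

```python
import json


def build_extra(**options) -> str:
    """Serialize generation options to the JSON text used for the extra field."""
    return json.dumps(options)


extra = build_extra(lang="en")
print(extra)  # prints {"lang": "en"}
```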
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "en"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "en"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'How are you doing today? This is a text-to-speech engine using next generation Kaldi.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'en'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'en'},
);
final audio = tts.generateWithConfig(text: 'How are you doing today? This is a text-to-speech engine using next generation Kaldi.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "en"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"en\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "en"),
)
val audio = tts.generateWithConfigAndCallback(
text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "How are you doing today? This is a text-to-speech engine using next generation Kaldi.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"en\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "en"}';
Audio := Tts.GenerateWithConfig('How are you doing today? This is a text-to-speech engine using next generation Kaldi.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "How are you doing today? This is a text-to-speech engine using next generation Kaldi."
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "en"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
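In the examples above, the Dart, Swift, and Kotlin bindings take the extra options as a map, while the C#, Java, Pascal, and Go bindings take them as a JSON string. If you build that string programmatically, a JSON library avoids hand-written escaping. The following is a minimal Python sketch; the helper name build_extra is hypothetical and is not part of sherpa-onnx.

```python
import json


def build_extra(**options: str) -> str:
    """Serialize extra generation options to a JSON string.

    build_extra is a hypothetical helper for illustration only.
    """
    return json.dumps(options)


extra_json = build_extra(lang="en")
print(extra_json)  # prints {"lang": "en"}

# The string parses back to the same mapping.
assert json.loads(extra_json) == {"lang": "en"}
```

The output matches the literal used in the C#, Java, Pascal, and Go examples.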
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 1
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 2
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 3
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 4
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 5
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 6
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 7
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 8
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
Speaker 9
0
Hello world.
1
How are you today?
2
The sky is blue.
3
I love machine learning.
4
Python is awesome.
5
Good morning everyone.
6
Artificial intelligence is growing.
7
Speech synthesis is fascinating.
8
Neural networks are powerful.
9
Text to speech converts text to audio.
10
The quick brown fox jumps over the lazy dog.
11
Machine learning enables computers to learn from data.
12
Natural language processing helps machines understand text.
13
Deep learning has revolutionized artificial intelligence.
14
Speech synthesis technology has advanced significantly.
15
Neural voice cloning can replicate speaking styles.
16
Text normalization is important for proper pronunciation.
17
Voice assistants help us interact with technology naturally.
18
Modern TTS systems use deep learning for high-quality speech.
19
Human computer interaction has become more intuitive.
vits-piper-en_GB-alan-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alan/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
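The sample rate in the table can be checked against any WAV file produced by the examples in this section. The following is a minimal sketch using only the Python standard library. It assumes the file is PCM WAV, and the helper name wav_info is made up for illustration.

```python
import wave


def wav_info(path: str) -> dict:
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "duration_seconds": num_frames / sample_rate,
        }


# Usage, once a WAV file such as test.wav exists:
# info = wav_info("test.wav")
# print(info["sample_rate"])  # compare with the sample rate in the table
```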
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-low.tar.bz2
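If you prefer to fetch and unpack the archive from Python rather than with wget and tar, the following is a minimal sketch using only the standard library. The helper names model_dir_name and download_and_extract are made up for illustration. The sketch assumes the archive unpacks into a directory named after the archive, which is what the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_GB-alan-low.tar.bz2"
)


def model_dir_name(url: str) -> str:
    """Return the directory name the archive is assumed to unpack into."""
    filename = url.rsplit("/", 1)[-1]
    suffix = ".tar.bz2"
    if filename.endswith(suffix):
        filename = filename[: -len(suffix)]
    return filename


def download_and_extract(url: str, dest: Path = Path(".")) -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    archive = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    archive.unlink()
    return dest / model_dir_name(url)


# Usage (needs network access):
# model_dir = download_and_extract(MODEL_URL)
# print(sorted(p.name for p in model_dir.iterdir()))
```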
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-alan-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alan-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alan-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-alan-low.tar.bz2
You can use the following code to play with vits-piper-en_GB-alan-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
data_dir="vits-piper-en_GB-alan-low/espeak-ng-data",
tokens="vits-piper-en_GB-alan-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-alan-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alan-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alan-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-alan-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-alan-low/en_GB-alan-low.onnx".into()),
tokens: Some("vits-piper-en_GB-alan-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-alan-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-alan-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx',
tokens: 'vits-piper-en_GB-alan-low/tokens.txt',
dataDir: 'vits-piper-en_GB-alan-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
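The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means generation ran faster than playback. The following is a minimal Python sketch of the same arithmetic. The function name and the numbers are illustrative only and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by the duration of the generated audio."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Example: 0.5 s of compute for 32000 samples at 16000 Hz (2 s of audio).
rtf = real_time_factor(0.5, 32000, 16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```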
Dart API
You can use the following code to play with vits-piper-en_GB-alan-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx',
tokens: 'vits-piper-en_GB-alan-low/tokens.txt',
dataDir: 'vits-piper-en_GB-alan-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-alan-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-alan-low/tokens.txt",
dataDir: "vits-piper-en_GB-alan-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-alan-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alan-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alan-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-alan-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
tokens = "vits-piper-en_GB-alan-low/tokens.txt",
dataDir = "vits-piper-en_GB-alan-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-alan-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-alan-low/en_GB-alan-low.onnx");
vits.setTokens("vits-piper-en_GB-alan-low/tokens.txt");
vits.setDataDir("vits-piper-en_GB-alan-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-alan-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-alan-low/en_GB-alan-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-alan-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-alan-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-alan-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-alan-low/en_GB-alan-low.onnx",
Tokens: "vits-piper-en_GB-alan-low/tokens.txt",
DataDir: "vits-piper-en_GB-alan-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-alan-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alan/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-alan-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-alan-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
data_dir="vits-piper-en_GB-alan-medium/espeak-ng-data",
tokens="vits-piper-en_GB-alan-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
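If soundfile is not installed, the generated samples can also be saved with Python's standard library. The following is a minimal sketch, assuming that `audio.samples` is a sequence of floats in the range [-1, 1] and that `audio.sample_rate` is an integer. The helper name `write_wav_int16` is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    # Clip to [-1, 1] so out-of-range values do not overflow int16.
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    pcm = struct.pack("<%dh" % len(clipped), *(int(s * 32767) for s in clipped))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # single channel
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)
        f.writeframes(pcm)


# Hypothetical usage with the objects from the example above:
# write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```

The sketch trades speed for zero extra dependencies; for long utterances a NumPy-based conversion would likely be faster.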
C++ API
You can use the following code to play with vits-piper-en_GB-alan-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-alan-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-alan-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-alan-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-alan-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx',
tokens: 'vits-piper-en_GB-alan-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-alan-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
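The real-time factor (RTF) printed by the script above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than real time. A small worked sketch in Python follows. The function name and the numbers are illustrative assumptions, not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (duration = samples / rate)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Illustrative numbers only: 110250 samples at 22050 Hz is 5 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=110250, sample_rate=22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.100"
```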
Dart API
You can use the following code to play with vits-piper-en_GB-alan-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx',
tokens: 'vits-piper-en_GB-alan-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-alan-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-alan-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-alan-medium/tokens.txt",
dataDir: "vits-piper-en_GB-alan-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-alan-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-alan-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
tokens = "vits-piper-en_GB-alan-medium/tokens.txt",
dataDir = "vits-piper-en_GB-alan-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-alan-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx");
vits.setTokens("vits-piper-en_GB-alan-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-alan-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-alan-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-alan-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-alan-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-alan-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-alan-medium/en_GB-alan-medium.onnx",
Tokens: "vits-piper-en_GB-alan-medium/tokens.txt",
DataDir: "vits-piper-en_GB-alan-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-alba-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/alba/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
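To confirm that a generated file matches the sample rate listed in the table above, the WAV header can be inspected with Python's standard library. This is a minimal sketch, assuming the file is a PCM WAV that the `wave` module can read. The helper name `describe_wav` is hypothetical, and `test.wav` is the output name used by the examples on this page.

```python
import wave


def describe_wav(filename):
    """Return (sample_rate, num_channels, duration_in_seconds) of a WAV file."""
    with wave.open(filename, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration


# Hypothetical usage after running one of the examples that writes test.wav:
# sample_rate, num_channels, duration = describe_wav("test.wav")
# assert sample_rate == 22050
```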
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-alba-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-alba-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
data_dir="vits-piper-en_GB-alba-medium/espeak-ng-data",
tokens="vits-piper-en_GB-alba-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-alba-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-alba-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-alba-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-alba-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-alba-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx',
tokens: 'vits-piper-en_GB-alba-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-alba-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
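The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean faster than real time. A minimal Python sketch of the same arithmetic follows. The function name `real_time_factor` and the numbers in the usage lines are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 5 s of audio at 22050 Hz generated in 0.5 s.
    rtf = real_time_factor(0.5, 5 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```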
Dart API
You can use the following code to play with vits-piper-en_GB-alba-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx',
tokens: 'vits-piper-en_GB-alba-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-alba-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-alba-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-alba-medium/tokens.txt",
dataDir: "vits-piper-en_GB-alba-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-alba-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-alba-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-alba-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-alba-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
tokens = "vits-piper-en_GB-alba-medium/tokens.txt",
dataDir = "vits-piper-en_GB-alba-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-alba-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx");
vits.setTokens("vits-piper-en_GB-alba-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-alba-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-alba-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-alba-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-alba-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-alba-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-alba-medium/en_GB-alba-medium.onnx",
Tokens: "vits-piper-en_GB-alba-medium/tokens.txt",
DataDir: "vits-piper-en_GB-alba-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-aru-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/aru/medium
| Number of speakers | Sample rate |
|---|---|
| 12 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-aru-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
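If the binary is launched from a script rather than an interactive shell, the same library search path can be prepared programmatically. The Python sketch below only builds the environment dictionary. `env_with_library_path` is a hypothetical helper, and the choice between LD_LIBRARY_PATH and DYLD_LIBRARY_PATH follows the Linux/macOS split described above.

```python
import os
import sys


def env_with_library_path(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Keep any existing entries after the new directory.
    env[var] = lib_dir + (":" + old if old else "")
    return env
```

The result could then be passed as `env=` to `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env)`, assuming the binary was built as described above.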
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-aru-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
            data_dir="vits-piper-en_GB-aru-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-aru-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
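If soundfile is not available, the samples can also be written with Python's standard `wave` module. The sketch below assumes `audio.samples` is a sequence of floats in the range [-1, 1] (this page does not state the range). `write_wav_int16` is a hypothetical helper, not a sherpa-onnx function.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)  # number of frames written
```

With the example above, one could call `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.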
C++ API
You can use the following code to play with vits-piper-en_GB-aru-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-aru-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-aru-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-aru-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-aru-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx',
tokens: 'vits-piper-en_GB-aru-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-aru-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-aru-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx',
tokens: 'vits-piper-en_GB-aru-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-aru-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-aru-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-aru-medium/tokens.txt",
dataDir: "vits-piper-en_GB-aru-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-aru-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-aru-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-aru-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-aru-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
tokens = "vits-piper-en_GB-aru-medium/tokens.txt",
dataDir = "vits-piper-en_GB-aru-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
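The callback convention used in the examples (return 1 to continue, return 0 to stop) can be illustrated without a model. The Python sketch below simulates a generator that hands chunks to a callback and stops as soon as the callback returns 0. `stream_chunks` and `stop_after_half` are hypothetical names, and the `progress` argument mirrors the C++ and Rust callbacks rather than the Kotlin one, which receives only the samples.

```python
def stream_chunks(chunks, callback):
    """Feed chunks to callback until it returns 0; return the chunks delivered."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.append(chunk)
        progress = i / total  # fraction of chunks produced so far
        if callback(chunk, progress) == 0:
            break  # the callback asked to stop
    return delivered


def stop_after_half(samples, progress):
    """Return 1 to continue and 0 to stop, mirroring the convention above."""
    return 1 if progress < 0.5 else 0


if __name__ == "__main__":
    fake_chunks = [[0.0] * 4 for _ in range(6)]  # stand-in for generated audio
    got = stream_chunks(fake_chunks, stop_after_half)
    print(len(got), "of", len(fake_chunks), "chunks delivered")
```

Returning 0 is a way to cancel a long synthesis early, for example when the user closes a playback window.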
Java API
You can use the following code to play with vits-piper-en_GB-aru-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx");
vits.setTokens("vits-piper-en_GB-aru-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-aru-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-aru-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-aru-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-aru-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-aru-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-aru-medium/en_GB-aru-medium.onnx",
Tokens: "vits-piper-en_GB-aru-medium/tokens.txt",
DataDir: "vits-piper-en_GB-aru-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
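Since this model has 12 speakers, samples like those listed above can be produced by looping over the speaker IDs 0 to 11. The Python sketch below shows only the loop. `synthesize_all_speakers` and the output file naming are assumptions for illustration, and the synthesizer is passed in as a callable so the sketch runs without a model.

```python
def synthesize_all_speakers(synthesize, text, num_speakers, prefix="speaker"):
    """Call synthesize(text, sid) for each speaker ID; return {filename: audio}."""
    results = {}
    for sid in range(num_speakers):  # speaker IDs start at 0
        filename = f"{prefix}-{sid}.wav"
        results[filename] = synthesize(text, sid)
    return results


if __name__ == "__main__":
    # Stand-in synthesizer so the sketch runs without a model. With the Python
    # example earlier in this section one could pass something like
    #   lambda text, sid: tts.generate(text=text, sid=sid, speed=1.0)
    fake = lambda text, sid: f"<audio for sid {sid}>"
    out = synthesize_all_speakers(fake, "Hello", 12)
    print(len(out), "files planned")
```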
vits-piper-en_GB-cori-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/cori/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-cori-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-cori-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-cori-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-cori-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
            data_dir="vits-piper-en_GB-cori-high/espeak-ng-data",
            tokens="vits-piper-en_GB-cori-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-cori-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-cori-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-cori-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
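The callback in the example above always returns 1, so generation never stops early. The return-value convention (1 to continue, 0 to stop) can be illustrated with a small standalone Python sketch. It does not call sherpa-onnx; `make_stop_after` is a hypothetical helper, and the `(samples, progress)` signature simply mirrors the C++ callback without its `arg` parameter.

```python
def make_stop_after(max_samples):
    """Build a progress callback that asks the generator to stop once
    max_samples samples have been received."""
    received = 0

    def callback(samples, progress):
        nonlocal received
        received += len(samples)
        # 1 = continue generating, 0 = stop generating
        return 1 if received < max_samples else 0

    return callback
```

A callback built with `make_stop_after(22050)` would request a stop after roughly one second of audio for a model with a 22050 Hz sample rate.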
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-cori-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-cori-high/en_GB-cori-high.onnx".into()),
tokens: Some("vits-piper-en_GB-cori-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-cori-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-cori-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx',
tokens: 'vits-piper-en_GB-cori-high/tokens.txt',
dataDir: 'vits-piper-en_GB-cori-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
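The Node.js example above reports a real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; `real_time_factor` is a hypothetical helper name used only for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration; values below 1 mean
    synthesis ran faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For example, producing 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.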
Dart API
You can use the following code to play with vits-piper-en_GB-cori-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx',
tokens: 'vits-piper-en_GB-cori-high/tokens.txt',
dataDir: 'vits-piper-en_GB-cori-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-cori-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-cori-high/tokens.txt",
dataDir: "vits-piper-en_GB-cori-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-cori-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-cori-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-cori-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-cori-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
tokens = "vits-piper-en_GB-cori-high/tokens.txt",
dataDir = "vits-piper-en_GB-cori-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-cori-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-cori-high/en_GB-cori-high.onnx");
vits.setTokens("vits-piper-en_GB-cori-high/tokens.txt");
vits.setDataDir("vits-piper-en_GB-cori-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-cori-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-cori-high/en_GB-cori-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-cori-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-cori-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-cori-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-cori-high/en_GB-cori-high.onnx",
Tokens: "vits-piper-en_GB-cori-high/tokens.txt",
DataDir: "vits-piper-en_GB-cori-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_GB-cori-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/cori/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
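The table above lists a sample rate of 22050 Hz. One way to confirm that a generated `test.wav` matches it is to read the file header with Python's standard `wave` module. This is a minimal sketch that assumes the file is uncompressed PCM WAV; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(filename):
    """Return (sample_rate, channels, duration_seconds) from a WAV header."""
    with wave.open(filename, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate
```

For a file produced with this model, the first element of `wav_info("test.wav")` would be expected to equal 22050.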
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-cori-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-cori-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
            data_dir="vits-piper-en_GB-cori-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-cori-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-cori-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-cori-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-cori-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-cori-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-cori-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx',
tokens: 'vits-piper-en_GB-cori-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-cori-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-cori-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx',
tokens: 'vits-piper-en_GB-cori-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-cori-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-cori-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-cori-medium/tokens.txt",
dataDir: "vits-piper-en_GB-cori-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-cori-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-cori-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-cori-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-cori-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
tokens = "vits-piper-en_GB-cori-medium/tokens.txt",
dataDir = "vits-piper-en_GB-cori-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-cori-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx");
vits.setTokens("vits-piper-en_GB-cori-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-cori-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-cori-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-cori-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-cori-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-cori-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx",
Tokens: "vits-piper-en_GB-cori-medium/tokens.txt",
DataDir: "vits-piper-en_GB-cori-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_GB-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-dii-high.tar.bz2
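If you would rather fetch the archive from a script, the download, extract, and clean-up steps can be sketched with the Python standard library as follows. `download_and_extract` is a hypothetical helper name; the sketch assumes the archive is a `.tar.bz2` file from a source you trust, since `extractall` unpacks whatever paths the archive contains.

```python
import os
import tarfile
import urllib.request


def download_and_extract(url, dest_dir="."):
    """Download a .tar.bz2 archive into dest_dir, unpack it there, and
    remove the archive afterwards."""
    archive = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)  # only use with trusted archives
    os.remove(archive)
```

Calling it with the address above should leave the model files in a directory next to your script, presumably `vits-piper-en_GB-dii-high/` given the paths used in the examples on this page.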
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-dii-high.tar.bz2
You can use the following code to play with vits-piper-en_GB-dii-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
            data_dir="vits-piper-en_GB-dii-high/espeak-ng-data",
            tokens="vits-piper-en_GB-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
# Save as WAV, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
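The Node.js example on this page reports a real-time factor (RTF), that is, the time spent generating divided by the duration of the generated audio. The same measurement can be sketched in Python as follows. The helper names `real_time_factor` and `timed_generate` are hypothetical and not part of sherpa-onnx. The sketch only relies on the `generate()` call and the `samples` and `sample_rate` attributes used in the Python example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return processing time divided by the duration of the generated audio."""
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


def timed_generate(tts, text, sid=0, speed=1.0):
    """Call tts.generate() and return (audio, rtf).

    Hypothetical helper; tts is a sherpa_onnx.OfflineTts object.
    """
    start = time.perf_counter()
    audio = tts.generate(text=text, sid=sid, speed=speed)
    elapsed = time.perf_counter() - start
    rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
    return audio, rtf
```

An RTF below 1 means the audio is generated faster than it takes to play back.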
C++ API
You can use the following code to play with vits-piper-en_GB-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-dii-high/en_GB-dii-high.onnx".into()),
tokens: Some("vits-piper-en_GB-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx',
tokens: 'vits-piper-en_GB-dii-high/tokens.txt',
dataDir: 'vits-piper-en_GB-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx',
tokens: 'vits-piper-en_GB-dii-high/tokens.txt',
dataDir: 'vits-piper-en_GB-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-dii-high/tokens.txt",
dataDir: "vits-piper-en_GB-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
tokens = "vits-piper-en_GB-dii-high/tokens.txt",
dataDir = "vits-piper-en_GB-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-dii-high/en_GB-dii-high.onnx");
vits.setTokens("vits-piper-en_GB-dii-high/tokens.txt");
vits.setDataDir("vits-piper-en_GB-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-dii-high/en_GB-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-dii-high/en_GB-dii-high.onnx",
Tokens: "vits-piper-en_GB-dii-high/tokens.txt",
DataDir: "vits-piper-en_GB-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-jenny_dioco-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/jenny_dioco/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
            data_dir="vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
            tokens="vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
# Save as WAV, like the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
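Before building the config, it can help to confirm that the extracted files are where the example expects them. The following is a minimal sketch. The helper name `missing_model_files` is hypothetical and not part of sherpa-onnx. It assumes the directory layout implied by the paths above: an .onnx model file, tokens.txt, and an espeak-ng-data directory inside the model directory.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist.

    Hypothetical helper; the three entries mirror the model, tokens, and
    data_dir arguments used in the example above.
    """
    root = Path(model_dir)
    required = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files(
    "vits-piper-en_GB-jenny_dioco-medium",
    "en_GB-jenny_dioco-medium.onnx",
)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found")
```

An empty list means all three paths exist. It does not check that the files themselves are valid.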
C++ API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-jenny_dioco-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx',
tokens: 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx',
tokens: 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
dataDir: "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
tokens = "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
dataDir = "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
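The callback convention used above (return 1 to continue, 0 to stop) can be illustrated with a small standalone Python sketch. It does not call sherpa-onnx: `generate_in_chunks` and `stop_after` are hypothetical helpers that only mimic the contract, assuming audio is delivered to the callback chunk by chunk.

```python
def generate_in_chunks(chunks, callback):
    """Hand chunks to the callback one by one; stop when it returns 0.

    Hypothetical stand-in for a TTS engine that honours the
    1 = continue / 0 = stop convention. Not part of sherpa-onnx.
    """
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        if callback(chunk) == 0:
            break  # the caller asked us to stop generating
    return delivered


def stop_after(max_samples):
    """Build a callback that requests a stop once max_samples were seen."""
    total = 0

    def callback(samples):
        nonlocal total
        total += len(samples)
        return 1 if total < max_samples else 0

    return callback


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]
    got = generate_in_chunks(chunks, stop_after(250))
    print("chunks delivered:", len(got))
```

A callback that always returns 1, as in the Kotlin example, lets generation run to the end of the text.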
Java API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx");
vits.setTokens("vits-piper-en_GB-jenny_dioco-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-jenny_dioco-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-jenny_dioco-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-jenny_dioco-medium/en_GB-jenny_dioco-medium.onnx",
Tokens: "vits-piper-en_GB-jenny_dioco-medium/tokens.txt",
DataDir: "vits-piper-en_GB-jenny_dioco-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
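Once the archive is extracted, a quick check that the files referenced by the examples below are in place can save a confusing runtime error. The following is a minimal sketch, assuming the model was extracted into the current directory; `missing_model_files` is a hypothetical helper, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected model files that are absent from model_dir."""
    root = Path(model_dir)
    expected = [
        root / model_name,        # the ONNX model file
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-en_GB-miro-high", "en_GB-miro-high.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```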
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-miro-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
data_dir="vits-piper-en_GB-miro-high/espeak-ng-data",
tokens="vits-piper-en_GB-miro-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
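The example above relies on the third-party soundfile package to save the audio. If you prefer to avoid that dependency, the samples can also be written as 16-bit PCM with Python's standard `wave` module. This is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]; `write_wav_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as 16-bit PCM WAV."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        pcm += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, it could be called as write_wav_int16("test.wav", audio.samples, audio.sample_rate).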
C++ API
You can use the following code to play with vits-piper-en_GB-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
config.model.vits.tokens = "vits-piper-en_GB-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-miro-high/en_GB-miro-high.onnx".into()),
tokens: Some("vits-piper-en_GB-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx',
tokens: 'vits-piper-en_GB-miro-high/tokens.txt',
dataDir: 'vits-piper-en_GB-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx',
tokens: 'vits-piper-en_GB-miro-high/tokens.txt',
dataDir: 'vits-piper-en_GB-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-miro-high/tokens.txt",
dataDir: "vits-piper-en_GB-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
tokens = "vits-piper-en_GB-miro-high/tokens.txt",
dataDir = "vits-piper-en_GB-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-miro-high/en_GB-miro-high.onnx");
vits.setTokens("vits-piper-en_GB-miro-high/tokens.txt");
vits.setDataDir("vits-piper-en_GB-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-miro-high/en_GB-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-miro-high/en_GB-miro-high.onnx",
Tokens: "vits-piper-en_GB-miro-high/tokens.txt",
DataDir: "vits-piper-en_GB-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-northern_english_male-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/northern_english_male/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
data_dir="vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
tokens="vits-piper-en_GB-northern_english_male-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
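To judge how fast synthesis runs on your machine, you can measure the real-time factor (RTF): elapsed wall-clock time divided by the duration of the generated audio. The sketch below is a minimal illustration; `timed` and `real_time_factor` are hypothetical helpers, not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the objects from the example above, you could wrap the call as audio, elapsed = timed(tts.generate, text=text, sid=0, speed=1.0), where text holds the input string, and then compute real_time_factor(elapsed, len(audio.samples), audio.sample_rate).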
C++ API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-northern_english_male-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-northern_english_male-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx',
tokens: 'vits-piper-en_GB-northern_english_male-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
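The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so a value below 1 means the model runs faster than real time. The following is a minimal Python sketch of that arithmetic. The function name and the timing numbers are hypothetical and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return synthesis time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 44100 samples at 22050 Hz,
# that is, 2 s of audio.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```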
Dart API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx',
tokens: 'vits-piper-en_GB-northern_english_male-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
dataDir: "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
tokens = "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
dataDir = "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx");
vits.setTokens("vits-piper-en_GB-northern_english_male-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-northern_english_male-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-northern_english_male-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-northern_english_male-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-northern_english_male-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-northern_english_male-medium/en_GB-northern_english_male-medium.onnx",
Tokens: "vits-piper-en_GB-northern_english_male-medium/tokens.txt",
DataDir: "vits-piper-en_GB-northern_english_male-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_GB-semaine-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/semaine/medium
| Number of speakers | Sample rate |
|---|---|
| 4 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-semaine-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-semaine-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
data_dir="vits-piper-en_GB-semaine-medium/espeak-ng-data",
tokens="vits-piper-en_GB-semaine-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
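If the soundfile package is not available, the generated samples can also be saved with Python's standard wave module. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]. The helper name write_wave_int16 is hypothetical and is not part of sherpa-onnx, and the sine tone only stands in for real TTS output.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in data: 0.1 s of a 440 Hz tone at 22050 Hz instead of TTS output.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / 22050) for n in range(2205)]
path = os.path.join(tempfile.gettempdir(), "tone.wav")
write_wave_int16(path, tone, 22050)
print("Saved to", path)
```

With the example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).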
C++ API
You can use the following code to play with vits-piper-en_GB-semaine-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-semaine-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-semaine-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-semaine-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx',
tokens: 'vits-piper-en_GB-semaine-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-semaine-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx',
tokens: 'vits-piper-en_GB-semaine-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-semaine-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-semaine-medium/tokens.txt",
dataDir: "vits-piper-en_GB-semaine-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-semaine-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-semaine-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-semaine-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
tokens = "vits-piper-en_GB-semaine-medium/tokens.txt",
dataDir = "vits-piper-en_GB-semaine-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
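The callback in the example above always returns 1, so generation is never interrupted. The following is a minimal Python sketch of the same "1 means continue, 0 means stop" convention, with a callback that asks for a stop once a sample budget is reached. The factory name make_limited_callback and the optional progress argument are assumptions for illustration; the exact callback signature differs between language bindings.

```python
def make_limited_callback(max_samples):
    """Return (callback, state); the callback returns 0 once max_samples were seen."""
    state = {"received": 0}

    def callback(samples, progress=0.0):
        state["received"] += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if state["received"] < max_samples else 0

    return callback, state


# Simulate a generator delivering audio in chunks of 3 samples.
callback, state = make_limited_callback(max_samples=5)
for chunk in ([0.0] * 3, [0.0] * 3, [0.0] * 3):
    keep_going = callback(chunk)
    print("received", state["received"], "continue" if keep_going else "stop")
    if not keep_going:
        break
```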
Java API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx");
vits.setTokens("vits-piper-en_GB-semaine-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-semaine-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-semaine-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-semaine-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-semaine-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
Tokens: "vits-piper-en_GB-semaine-medium/tokens.txt",
DataDir: "vits-piper-en_GB-semaine-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
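Since this model has 4 speakers, one audio file per speaker can be produced by looping over the speaker IDs 0 to 3. The following is a minimal Python sketch. The helper synthesize_all_speakers and the output file names are hypothetical. The function demo_with_sherpa_onnx reuses the calls from the Python example for this model and is not executed here, because it needs the sherpa-onnx and soundfile packages and the downloaded model.

```python
def synthesize_all_speakers(generate, text, num_speakers, speed=1.0):
    """Call generate(text=..., sid=..., speed=...) once per speaker ID."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    return {
        sid: generate(text=text, sid=sid, speed=speed)
        for sid in range(num_speakers)
    }


def demo_with_sherpa_onnx():
    # Not called below: requires sherpa-onnx, soundfile, and the model files.
    import sherpa_onnx
    import soundfile as sf

    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            vits=sherpa_onnx.OfflineTtsVitsModelConfig(
                model="vits-piper-en_GB-semaine-medium/en_GB-semaine-medium.onnx",
                data_dir="vits-piper-en_GB-semaine-medium/espeak-ng-data",
                tokens="vits-piper-en_GB-semaine-medium/tokens.txt",
            ),
            num_threads=1,
        ),
    )
    tts = sherpa_onnx.OfflineTts(config)
    text = "Friends fell out often because life was changing so fast."
    # 4 is the number of speakers listed for this model.
    for sid, audio in synthesize_all_speakers(tts.generate, text, 4).items():
        # Hypothetical file naming scheme
        sf.write(f"speaker-{sid}.wav", audio.samples, samplerate=audio.sample_rate)


# Stand-in generator so that the loop can be tried without the model.
def fake_generate(text, sid, speed):
    return f"audio for speaker {sid}"


print(synthesize_all_speakers(fake_generate, "hello", 4))
```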
vits-piper-en_GB-southern_english_female-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/southern_english_female/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-southern_english_female-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
data_dir="vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
tokens="vits-piper-en_GB-southern_english_female-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
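The Python example above saves the audio through the soundfile package. If you prefer not to install it, the samples can also be written with the standard library alone. The following is a minimal sketch, assuming that audio.samples is a sequence of floats in the range [-1, 1] and that the audio is mono; the helper name write_wav_16bit is hypothetical.

```python
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    frames = bytearray()
    count = 0
    for s in samples:
        # Clip to [-1, 1] so the scaled value fits in a signed 16-bit integer.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
        count += 1
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono (assumption)
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return count
```

With the objects from the example above, the call would be write_wav_16bit("test.wav", audio.samples, audio.sample_rate).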
C++ API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx".into()),
tokens: Some("vits-piper-en_GB-southern_english_female-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-southern_english_female-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx',
tokens: 'vits-piper-en_GB-southern_english_female-low/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx',
tokens: 'vits-piper-en_GB-southern_english_female-low/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-southern_english_female-low/tokens.txt",
dataDir: "vits-piper-en_GB-southern_english_female-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
tokens = "vits-piper-en_GB-southern_english_female-low/tokens.txt",
dataDir = "vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
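The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch only simulates that contract so that it can be tried without a model. generate_in_chunks is a hypothetical stand-in for the engine, not a sherpa-onnx function, and it assumes that the chunk that triggered the stop is still kept in the output.

```python
def generate_in_chunks(chunks, callback):
    """Hypothetical stand-in for a TTS engine that emits audio chunk by chunk.

    The callback is called once per chunk; returning 0 stops generation early.
    """
    out = []
    for chunk in chunks:
        out.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return out


def stop_after(limit):
    """Build a callback that asks to stop once `limit` samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < limit else 0

    return callback


chunks = [[0.1] * 4, [0.2] * 4, [0.3] * 4]
print(len(generate_in_chunks(chunks, stop_after(6))))
```

A callback of this shape is a common way to cancel synthesis when a user interrupts playback.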
Java API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx");
vits.setTokens("vits-piper-en_GB-southern_english_female-low/tokens.txt");
vits.setDataDir("vits-piper-en_GB-southern_english_female-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_female-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_female-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-southern_english_female-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-southern_english_female-low/en_GB-southern_english_female-low.onnx",
Tokens: "vits-piper-en_GB-southern_english_female-low/tokens.txt",
DataDir: "vits-piper-en_GB-southern_english_female-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_GB-southern_english_female-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_female-medium
| Number of speakers | Sample rate |
|---|---|
| 6 | 22050 |
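Since this model has 6 speakers, it can help to validate the speaker ID before passing it as sid. The following is a minimal sketch, assuming that speaker IDs are zero-based (the examples on this page use sid 0); the helper names check_sid and output_names are hypothetical.

```python
NUM_SPEAKERS = 6  # from the table above


def check_sid(sid, num_speakers=NUM_SPEAKERS):
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if not isinstance(sid, int) or not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be an integer in [0, {num_speakers - 1}], got {sid!r}"
        )
    return sid


def output_names(prefix="test", num_speakers=NUM_SPEAKERS):
    """One output filename per speaker, for example test-sid0.wav."""
    return [
        f"{prefix}-sid{check_sid(s, num_speakers)}.wav"
        for s in range(num_speakers)
    ]


print(output_names())
```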
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
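If you have adb access to the device, adb shell getprop ro.product.cpu.abi prints its primary ABI. When only a machine name (as printed by uname -m) is known, a small lookup can suggest the ABI. The following is a minimal sketch; the mapping table is an assumption that covers only common machine names, and it falls back to arm64-v8a as suggested above.

```python
import platform

# Assumed mapping from `uname -m` style machine names to Android ABI names.
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
}


def guess_abi(machine, default="arm64-v8a"):
    """Return the likely Android ABI for a machine name, or the default."""
    return MACHINE_TO_ABI.get(machine.lower(), default)


# Prints the guess for the machine running this script.
print(guess_abi(platform.machine()))
```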
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
data_dir="vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
tokens="vits-piper-en_GB-southern_english_female-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-southern_english_female-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-southern_english_female-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx',
tokens: 'vits-piper-en_GB-southern_english_female-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
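The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows; the function name real_time_factor is hypothetical, and the numbers in the usage line are made up for illustration (only the 22050 Hz sample rate comes from the model table).

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration


# Illustrative numbers: 5 seconds of audio at 22050 Hz generated in 0.5 seconds.
rtf = real_time_factor(0.5, 22050 * 5, 22050)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.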
Dart API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx',
tokens: 'vits-piper-en_GB-southern_english_female-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
dataDir: "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
tokens = "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
dataDir = "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx");
vits.setTokens("vits-piper-en_GB-southern_english_female-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-southern_english_female-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_female-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_female-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-southern_english_female-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-southern_english_female-medium/en_GB-southern_english_female-medium.onnx",
Tokens: "vits-piper-en_GB-southern_english_female-medium/tokens.txt",
DataDir: "vits-piper-en_GB-southern_english_female-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
vits-piper-en_GB-southern_english_male-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_male-medium
| Number of speakers | Sample rate |
|---|---|
| 8 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
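The choice between the two environment variables above can be sketched in Python. This is only an illustration: the helper name library_path_setting is hypothetical and is not part of sherpa-onnx. It picks the variable by platform and prepends the library directory without leaving a trailing colon when the variable is currently empty.

```python
import os
import sys


def library_path_setting(lib_dir, platform=sys.platform, current=None):
    """Return (variable name, new value) for the dynamic-loader search path.

    Hypothetical helper for illustration; not a sherpa-onnx API.
    """
    if platform.startswith("linux"):
        name = "LD_LIBRARY_PATH"
    elif platform == "darwin":
        name = "DYLD_LIBRARY_PATH"
    else:
        raise ValueError(f"unsupported platform: {platform}")
    if current is None:
        # Fall back to whatever the current process has set.
        current = os.environ.get(name, "")
    # Prepend lib_dir; skip the separator when nothing was set before.
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value


# Platform and current value are passed explicitly so the demo is deterministic.
name, value = library_path_setting(
    "/tmp/sherpa-onnx/shared/lib", platform="linux", current=""
)
print(f"export {name}={value}")
```

The returned pair could also be merged into the env argument of subprocess.run when launching /tmp/test-piper from a script.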
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
data_dir="vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
tokens="vits-piper-en_GB-southern_english_male-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
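If soundfile is not available, the generated samples could also be written with the Python standard library alone. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the helper name write_wav_pcm16 is hypothetical, and the sine tone stands in for real TTS output so that the snippet runs without a model.

```python
import math
import struct
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip anything out of range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in data: one second of a 440 Hz tone at this model's 22050 Hz sample rate.
sample_rate = 22050
samples = [
    0.3 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)
]
write_wav_pcm16("tone.wav", samples, sample_rate)
```

With real output, audio.samples and audio.sample_rate from the example above would take the place of the stand-in values.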
C++ API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-southern_english_male-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-southern_english_male-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx',
tokens: 'vits-piper-en_GB-southern_english_male-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
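The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python with made-up numbers; the function name and the input values are illustrative only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    if sample_rate <= 0 or num_samples <= 0:
        raise ValueError("need a positive sample rate and at least one sample")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```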
Dart API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx',
tokens: 'vits-piper-en_GB-southern_english_male-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
dataDir: "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
tokens = "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
dataDir = "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx");
vits.setTokens("vits-piper-en_GB-southern_english_male-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-southern_english_male-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
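The callbacks in the Kotlin and Java examples return 1 to continue and 0 to stop. The Python sketch below illustrates that convention with a callback that asks to stop once it has seen a given number of samples. The driver function fake_generate is a hypothetical stand-in written for this illustration; it is not sherpa-onnx code, and how the real library chunks its audio is an assumption here.

```python
def make_limited_callback(max_samples):
    """Return a callback that returns 0 (stop) once max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if seen < max_samples else 0

    return callback


def fake_generate(chunks, callback):
    """Hypothetical stand-in for a synthesis loop; NOT part of sherpa-onnx."""
    delivered = []
    for chunk in chunks:
        delivered.extend(chunk)
        if callback(chunk) == 0:
            break  # the callback asked to stop
    return delivered


# Three chunks of 100 silent samples each; stop after at least 150 samples.
chunks = [[0.0] * 100, [0.0] * 100, [0.0] * 100]
audio = fake_generate(chunks, make_limited_callback(150))
print(len(audio))
```

In this sketch the stop request takes effect only after the current chunk has been delivered, so the result can be longer than the requested limit.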
Pascal API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-southern_english_male-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-southern_english_male-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-southern_english_male-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-southern_english_male-medium/en_GB-southern_english_male-medium.onnx",
Tokens: "vits-piper-en_GB-southern_english_male-medium/tokens.txt",
DataDir: "vits-piper-en_GB-southern_english_male-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
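To produce one file per speaker locally, the speaker ID passed to any of the examples above can be varied from 0 to 7. The Python sketch below only plans the (speaker ID, filename) pairs; the helper name and the filename pattern are hypothetical, and the actual synthesis call is left to whichever API is used.

```python
def speaker_outputs(num_speakers, prefix="speaker"):
    """Return (sid, filename) pairs for speaker IDs 0 .. num_speakers - 1."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    # Zero-pad so that filenames sort in speaker order.
    width = len(str(num_speakers - 1))
    return [
        (sid, f"{prefix}-{sid:0{width}d}.wav") for sid in range(num_speakers)
    ]


# This model has 8 speakers according to its info table.
outputs = speaker_outputs(8)
for sid, filename in outputs:
    print(sid, filename)
```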
vits-piper-en_GB-vctk-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_GB/vctk/medium
| Number of speakers | Sample rate |
|---|---|
| 109 | 22050 |
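With 109 speakers, it can help to check a speaker ID before passing it to one of the APIs below. The following Python sketch assumes speaker IDs are zero-based, so that valid values run from 0 to 108; the helper name validate_sid is hypothetical and not part of sherpa-onnx.

```python
NUM_SPEAKERS = 109  # from the table above


def validate_sid(sid, num_speakers=NUM_SPEAKERS):
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if isinstance(sid, bool) or not isinstance(sid, int):
        raise TypeError("sid must be an integer")
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


print(validate_sid(0), validate_sid(108))
```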
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_GB-vctk-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_GB-vctk-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
data_dir="vits-piper-en_GB-vctk-medium/espeak-ng-data",
tokens="vits-piper-en_GB-vctk-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_GB-vctk-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
config.model.vits.tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx".into()),
tokens: Some("vits-piper-en_GB-vctk-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_GB-vctk-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_GB-vctk-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx',
tokens: 'vits-piper-en_GB-vctk-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-vctk-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx',
tokens: 'vits-piper-en_GB-vctk-medium/tokens.txt',
dataDir: 'vits-piper-en_GB-vctk-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_GB-vctk-medium/tokens.txt",
dataDir: "vits-piper-en_GB-vctk-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_GB-vctk-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_GB-vctk-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_GB-vctk-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
tokens = "vits-piper-en_GB-vctk-medium/tokens.txt",
dataDir = "vits-piper-en_GB-vctk-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx");
vits.setTokens("vits-piper-en_GB-vctk-medium/tokens.txt");
vits.setDataDir("vits-piper-en_GB-vctk-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
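The callbacks in the Kotlin and Java examples return 1 to continue and 0 to stop. The following plain Python sketch illustrates that contract only. The function fake_generate is a hypothetical stand-in for the engine, and it assumes that the chunk passed to the callback is kept even when the callback asks to stop. How the real engine splits audio into chunks is not specified here.

```python
from typing import Callable, List, Sequence


def fake_generate(chunks: Sequence[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical engine: pass each chunk to callback until it returns 0."""
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return produced


def make_limited_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4]
audio = fake_generate(chunks, make_limited_callback(max_samples=8))
print(len(audio), "samples kept before the callback asked to stop")
```

A callback that always returns 1, as in the examples above, lets generation run to the end of the text.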
Pascal API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_GB-vctk-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_GB-vctk-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_GB-vctk-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_GB-vctk-medium/en_GB-vctk-medium.onnx",
Tokens: "vits-piper-en_GB-vctk-medium/tokens.txt",
DataDir: "vits-piper-en_GB-vctk-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
Speaker 18
Speaker 19
Speaker 20
Speaker 21
Speaker 22
Speaker 23
Speaker 24
Speaker 25
Speaker 26
Speaker 27
Speaker 28
Speaker 29
Speaker 30
Speaker 31
Speaker 32
Speaker 33
Speaker 34
Speaker 35
Speaker 36
Speaker 37
Speaker 38
Speaker 39
Speaker 40
Speaker 41
Speaker 42
Speaker 43
Speaker 44
Speaker 45
Speaker 46
Speaker 47
Speaker 48
Speaker 49
Speaker 50
Speaker 51
Speaker 52
Speaker 53
Speaker 54
Speaker 55
Speaker 56
Speaker 57
Speaker 58
Speaker 59
Speaker 60
Speaker 61
Speaker 62
Speaker 63
Speaker 64
Speaker 65
Speaker 66
Speaker 67
Speaker 68
Speaker 69
Speaker 70
Speaker 71
Speaker 72
Speaker 73
Speaker 74
Speaker 75
Speaker 76
Speaker 77
Speaker 78
Speaker 79
Speaker 80
Speaker 81
Speaker 82
Speaker 83
Speaker 84
Speaker 85
Speaker 86
Speaker 87
Speaker 88
Speaker 89
Speaker 90
Speaker 91
Speaker 92
Speaker 93
Speaker 94
Speaker 95
Speaker 96
Speaker 97
Speaker 98
Speaker 99
Speaker 100
Speaker 101
Speaker 102
Speaker 103
Speaker 104
Speaker 105
Speaker 106
Speaker 107
Speaker 108
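The list above covers speaker IDs 0 through 108, that is, 109 speakers. To render the same text once per speaker, you could enumerate the IDs and pick one output file per ID. The following is a minimal Python sketch of that bookkeeping. The helper speaker_jobs and the file name pattern are illustrative assumptions and are not part of sherpa-onnx.

```python
from typing import List, Tuple


def speaker_jobs(num_speakers: int, prefix: str = "speaker") -> List[Tuple[int, str]]:
    """Return (sid, filename) pairs for speaker IDs 0 .. num_speakers - 1."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    width = len(str(num_speakers - 1))  # zero-pad so file names sort by ID
    return [(sid, f"{prefix}-{sid:0{width}d}.wav") for sid in range(num_speakers)]


jobs = speaker_jobs(109)
print(len(jobs), jobs[0], jobs[-1])

# With a tts object created as in a Python API example on this page, each job
# could then be rendered like this (left commented out because it needs the model):
# for sid, filename in jobs:
#     audio = tts.generate(text=text, sid=sid, speed=1.0)
#     sf.write(filename, audio.samples, samplerate=audio.sample_rate)
```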
vits-piper-en_US-amy-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/amy/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
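If you prefer to fetch the archive from a script rather than by hand, the following Python sketch downloads it, extracts it, and removes the archive, using only the standard library. It assumes that the archive unpacks into a directory named after the archive, which matches the paths used in the examples below. The helper names are illustrative and are not part of sherpa-onnx.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-en_US-amy-low.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of the URL path."""
    return Path(urlparse(url).path).name


def model_dir_name(url: str) -> str:
    """Return the directory the archive is assumed to unpack into."""
    name = archive_name(url)
    suffix = ".tar.bz2"
    if not name.endswith(suffix):
        raise ValueError(f"expected a {suffix} archive, got {name}")
    return name[: -len(suffix)]


def download_and_extract(url: str, dest: str = ".") -> Path:
    """Download the archive into dest, extract it, and delete the archive."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    archive = dest_path / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_path)
    archive.unlink()
    return dest_path / model_dir_name(url)


# Only the name handling runs here; the download itself needs network access.
print(archive_name(MODEL_URL), "->", model_dir_name(MODEL_URL))
```

Calling download_and_extract(MODEL_URL) performs the actual download and returns the path of the model directory.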
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-amy-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-amy-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-amy-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
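After the program has written ./test.wav, you may want to confirm that the file has the sample rate listed in the info table above (16000 for this model). The following Python sketch reads a WAV header with the standard wave module. It assumes the file is uncompressed PCM, which is the format that module is designed to read. The helper names are illustrative, and the demo writes a short silent file of its own so that it runs without the model.

```python
import os
import tempfile
import wave
from typing import Tuple


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        return sample_rate, f.getnchannels(), f.getnframes() / sample_rate


def check_sample_rate(path: str, expected: int) -> bool:
    """Return True if the file's sample rate equals the expected value."""
    return wav_info(path)[0] == expected


def write_silence(path: str, sample_rate: int, seconds: float) -> None:
    """Write a mono 16-bit WAV of silence; used only for the demo below."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * int(sample_rate * seconds))


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.wav")
    write_silence(demo, 16000, 1.0)
    print(wav_info(demo), check_sample_rate(demo, 16000))
```

For the file produced by the C example, you would call check_sample_rate("/tmp/test.wav", 16000), adjusting the path to wherever you ran the program.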
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2
You can use the following code to play with vits-piper-en_US-amy-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-amy-low/en_US-amy-low.onnx",
data_dir="vits-piper-en_US-amy-low/espeak-ng-data",
tokens="vits-piper-en_US-amy-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-amy-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-amy-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-amy-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-amy-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-amy-low/en_US-amy-low.onnx".into()),
tokens: Some("vits-piper-en_US-amy-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-amy-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-amy-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-amy-low/en_US-amy-low.onnx',
tokens: 'vits-piper-en_US-amy-low/tokens.txt',
dataDir: 'vits-piper-en_US-amy-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-amy-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-amy-low/en_US-amy-low.onnx',
tokens: 'vits-piper-en_US-amy-low/tokens.txt',
dataDir: 'vits-piper-en_US-amy-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-amy-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
lexicon: "",
tokens: "vits-piper-en_US-amy-low/tokens.txt",
dataDir: "vits-piper-en_US-amy-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-amy-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-amy-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-amy-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-amy-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
tokens = "vits-piper-en_US-amy-low/tokens.txt",
dataDir = "vits-piper-en_US-amy-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-amy-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-amy-low/en_US-amy-low.onnx");
vits.setTokens("vits-piper-en_US-amy-low/tokens.txt");
vits.setDataDir("vits-piper-en_US-amy-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-amy-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-amy-low/en_US-amy-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-amy-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-amy-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-amy-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-amy-low/en_US-amy-low.onnx",
Tokens: "vits-piper-en_US-amy-low/tokens.txt",
DataDir: "vits-piper-en_US-amy-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-amy-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/amy/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-amy-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-amy-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-amy-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-amy-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
data_dir="vits-piper-en_US-amy-medium/espeak-ng-data",
tokens="vits-piper-en_US-amy-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-amy-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-amy-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-amy-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
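If you launch the compiled binary from a script rather than from a shell, the same search-path setup can be sketched in Python. This is an illustrative helper only: the name `library_env` is hypothetical, and it assumes the library directory used in the build steps above.

```python
import os
import platform


def library_env(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given platform.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env


if __name__ == "__main__":
    env = library_env("/tmp/sherpa-onnx/shared/lib")
    # To launch the compiled example with this environment:
    #   import subprocess
    #   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(name, "=", env[name])
```

The helper leaves the calling process's own environment untouched, so only the launched program sees the modified search path.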
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-amy-medium with Rust API.
use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-amy-medium/en_US-amy-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-amy-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-amy-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-amy-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx',
        tokens: 'vits-piper-en_US-amy-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-amy-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();

const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;

console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
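The Node.js example reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. The same arithmetic is sketched below in Python; the function name and the numbers in the usage part are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (duration_seconds, rtf) for a generated waveform.

    duration = num_samples / sample_rate
    rtf      = elapsed_seconds / duration
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return duration, elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 110250 samples at 22050 Hz is 5 seconds
    # of audio; 0.5 seconds of compute is an assumed value.
    duration, rtf = real_time_factor(0.5, 110250, 22050)
    print(f"Wave duration {duration:.3f} seconds")
    print(f"RTF = {rtf:.3f}")
```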
Dart API
You can use the following code to play with vits-piper-en_US-amy-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx',
    tokens: 'vits-piper-en_US-amy-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-amy-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );

  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);

  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );

  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-amy-medium with Swift API.
func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-amy-medium/tokens.txt",
    dataDir: "vits-piper-en_US-amy-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)

  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-amy-medium with C# API.
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-amy-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-amy-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);

var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);

var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
    Console.WriteLine("Saved to ./test.wav");
}
else
{
    Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-amy-medium with Kotlin API.
package com.k2fsa.sherpa.onnx

fun main() {
    var config = OfflineTtsConfig(
        model = OfflineTtsModelConfig(
            vits = OfflineTtsVitsModelConfig(
                model = "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
                tokens = "vits-piper-en_US-amy-medium/tokens.txt",
                dataDir = "vits-piper-en_US-amy-medium/espeak-ng-data",
            ),
            numThreads = 1,
            debug = true,
        ),
    )
    val tts = OfflineTts(config = config)

    val genConfig = GenerationConfig(
        sid = 0,
        speed = 1.0f,
        silenceScale = 0.2f,
    )

    val audio = tts.generateWithConfigAndCallback(
        text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
        config = genConfig,
        callback = ::callback,
    )
    audio.save(filename = "test.wav")
    tts.release()
    println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
    // 1 means to continue
    // 0 means to stop
    return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-amy-medium with Java API.
import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
    public static void main(String[] args) {
        var vits = new OfflineTtsVitsModelConfig();
        vits.setModel("vits-piper-en_US-amy-medium/en_US-amy-medium.onnx");
        vits.setTokens("vits-piper-en_US-amy-medium/tokens.txt");
        vits.setDataDir("vits-piper-en_US-amy-medium/espeak-ng-data");

        var modelConfig = new OfflineTtsModelConfig();
        modelConfig.setVits(vits);
        modelConfig.setNumThreads(1);
        modelConfig.setDebug(true);

        var config = new OfflineTtsConfig();
        config.setModel(modelConfig);
        config.setMaxNumSentences(1);

        var tts = new OfflineTts(config);

        var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

        var genConfig = new GenerationConfig();
        genConfig.setSid(0);
        genConfig.setSpeed(1.0f);
        genConfig.setSilenceScale(0.2f);

        var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
            // 1 means to continue, 0 means to stop
            return 1;
        });

        audio.save("test.wav");
        tts.release();
        System.out.println("Saved to test.wav");
    }
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-amy-medium with Pascal API.
program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);
  Config.Model.Vits.Model := 'vits-piper-en_US-amy-medium/en_US-amy-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-amy-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-amy-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-amy-medium with Go API.
package main

import (
    "fmt"

    sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
    config := sherpa.OfflineTtsConfig{
        Model: sherpa.OfflineTtsModelConfig{
            Vits: sherpa.OfflineTtsVitsModelConfig{
                Model:   "vits-piper-en_US-amy-medium/en_US-amy-medium.onnx",
                Tokens:  "vits-piper-en_US-amy-medium/tokens.txt",
                DataDir: "vits-piper-en_US-amy-medium/espeak-ng-data",
            },
            NumThreads: 1,
            Debug:      true,
        },
        MaxNumSentences: 1,
    }

    tts := sherpa.NewOfflineTts(&config)
    defer tts.Delete()

    text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

    genConfig := sherpa.GenerationConfig{
        Sid:          0,
        Speed:        1.0,
        SilenceScale: 0.2,
    }

    audio := tts.GenerateWithConfig(text, &genConfig, nil)

    filename := "./test.wav"
    sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
    fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-arctic-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/arctic/medium
| Number of speakers | Sample rate |
|---|---|
| 18 | 22050 |
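Because this model has 18 speakers, the speaker ID (`sid`) passed at generation time selects the voice; valid IDs run from 0 to 17, matching the entries in the Samples list for this model. The sketch below only shows the ID range check and one way to name per-speaker output files. The helper name, the `arctic` prefix, and the file-name pattern are assumptions made for illustration.

```python
NUM_SPEAKERS = 18  # from the table above


def output_name(sid, num_speakers=NUM_SPEAKERS, prefix="arctic"):
    """Return an output file name for speaker `sid`.

    Hypothetical helper for illustration; it rejects IDs outside
    0..num_speakers-1.
    """
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return f"{prefix}-sid-{sid:02d}.wav"


if __name__ == "__main__":
    for sid in range(NUM_SPEAKERS):
        # With a real engine, generate with this sid and save the result
        # under the name below.
        print(sid, output_name(sid))
```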
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-arctic-medium with C API.
#include <stdio.h>
#include <string.h>

#include "sherpa-onnx/c-api/c-api.h"

int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.vits.model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);

  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);

  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);

  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-arctic-medium with the Python API.
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
            data_dir="vits-piper-en_US-arctic-medium/espeak-ng-data",
            tokens="vits-piper-en_US-arctic-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
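If soundfile is not available, the generated samples can also be saved with Python's standard `wave` module. The following is a minimal sketch under one stated assumption: the samples are mono floating-point values in the range [-1, 1]. The function name `write_wave_int16` is hypothetical, and out-of-range values are clipped before conversion to 16-bit PCM.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # A short stand-in signal; with a real engine you would pass the
    # generated samples and sample rate instead.
    write_wave_int16("sketch.wav", [0.0, 1.0, -1.0, 0.25], 22050)
    print("Saved to sketch.wav")
```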
C++ API
You can use the following code to play with vits-piper-en_US-arctic-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>

#include "sherpa-onnx/c-api/cxx-api.h"

static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}

int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;
  config.model.vits.model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
  config.model.vits.tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
  config.model.num_threads = 1;

  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

  auto tts = OfflineTts::Create(config);

  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed

#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif

  WriteWave(filename, {audio.samples, audio.sample_rate});

  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());

  return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-arctic-medium with Rust API.
use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};

fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            vits: OfflineTtsVitsModelConfig {
                model: Some("vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx".into()),
                tokens: Some("vits-piper-en_US-arctic-medium/tokens.txt".into()),
                data_dir: Some("vits-piper-en_US-arctic-medium/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };

    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());

    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };

    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");

    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-arctic-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');

function createOfflineTts() {
  const config = {
    model: {
      vits: {
        model: 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx',
        tokens: 'vits-piper-en_US-arctic-medium/tokens.txt',
        dataDir: 'vits-piper-en_US-arctic-medium/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return new sherpa_onnx.OfflineTts(config);
}

const tts = createOfflineTts();

const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';

const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});

let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();

const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;

console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));

const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-arctic-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

void main() {
  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
    model: 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx',
    tokens: 'vits-piper-en_US-arctic-medium/tokens.txt',
    dataDir: 'vits-piper-en_US-arctic-medium/espeak-ng-data',
  );

  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    vits: vits,
    numThreads: 1,
    debug: true,
  );

  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );

  final tts = sherpa_onnx.OfflineTts(config);

  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );

  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();

  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-arctic-medium with Swift API.
func run() {
  let vits = sherpaOnnxOfflineTtsVitsModelConfig(
    model: "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
    lexicon: "",
    tokens: "vits-piper-en_US-arctic-medium/tokens.txt",
    dataDir: "vits-piper-en_US-arctic-medium/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)

  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2

  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)

  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}

@main
struct App {
  static func main() {
    run()
  }
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-arctic-medium with C# API.
using SherpaOnnx;

var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-arctic-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-arctic-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;

var tts = new OfflineTts(config);

var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;

var audio = tts.GenerateWithConfig(text, genConfig, null);

var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
    Console.WriteLine("Saved to ./test.wav");
}
else
{
    Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-arctic-medium with Kotlin API.
package com.k2fsa.sherpa.onnx

fun main() {
    var config = OfflineTtsConfig(
        model = OfflineTtsModelConfig(
            vits = OfflineTtsVitsModelConfig(
                model = "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
                tokens = "vits-piper-en_US-arctic-medium/tokens.txt",
                dataDir = "vits-piper-en_US-arctic-medium/espeak-ng-data",
            ),
            numThreads = 1,
            debug = true,
        ),
    )
    val tts = OfflineTts(config = config)

    val genConfig = GenerationConfig(
        sid = 0,
        speed = 1.0f,
        silenceScale = 0.2f,
    )

    val audio = tts.generateWithConfigAndCallback(
        text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
        config = genConfig,
        callback = ::callback,
    )
    audio.save(filename = "test.wav")
    tts.release()
    println("Saved to test.wav")
}

fun callback(samples: FloatArray): Int {
    // 1 means to continue
    // 0 means to stop
    return 1
}
Please refer to the Kotlin API documentation for more details.
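The callback in the example above always returns 1, so generation runs to completion. Returning 0 asks the engine to stop early. The sketch below illustrates that contract in Python with a stand-in driver loop. Both `make_limited_callback` and `fake_generate` are hypothetical names invented for this illustration; `fake_generate` only simulates an engine that delivers audio chunk by chunk and is not the sherpa-onnx implementation.

```python
def make_limited_callback(max_samples):
    """Build a callback that returns 1 (continue) until at least
    max_samples samples have been received, then returns 0 (stop).

    Returns the callback and the list that collects received chunks.
    """
    received = []

    def callback(samples):
        received.append(list(samples))
        total = sum(len(chunk) for chunk in received)
        return 1 if total < max_samples else 0

    return callback, received


def fake_generate(chunks, callback):
    """Hypothetical stand-in for an engine: pass each chunk to the
    callback and stop as soon as it returns 0. Returns the number of
    chunks delivered."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]  # five chunks of 100 samples
    callback, received = make_limited_callback(max_samples=250)
    print("Chunks delivered:", fake_generate(chunks, callback))
```

With five chunks of 100 samples and a 250-sample budget, the callback returns 0 on the third chunk, so the remaining two are never delivered.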
Java API
You can use the following code to play with vits-piper-en_US-arctic-medium with Java API.
import com.k2fsa.sherpa.onnx.*;

public class TtsDemo {
    public static void main(String[] args) {
        var vits = new OfflineTtsVitsModelConfig();
        vits.setModel("vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx");
        vits.setTokens("vits-piper-en_US-arctic-medium/tokens.txt");
        vits.setDataDir("vits-piper-en_US-arctic-medium/espeak-ng-data");

        var modelConfig = new OfflineTtsModelConfig();
        modelConfig.setVits(vits);
        modelConfig.setNumThreads(1);
        modelConfig.setDebug(true);

        var config = new OfflineTtsConfig();
        config.setModel(modelConfig);
        config.setMaxNumSentences(1);

        var tts = new OfflineTts(config);

        var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";

        var genConfig = new GenerationConfig();
        genConfig.setSid(0);
        genConfig.setSpeed(1.0f);
        genConfig.setSilenceScale(0.2f);

        var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
            // 1 means to continue, 0 means to stop
            return 1;
        });

        audio.save("test.wav");
        tts.release();
        System.out.println("Saved to test.wav");
    }
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-arctic-medium with Pascal API.
program test_piper;

{$mode objfpc}

uses
  SysUtils,
  sherpa_onnx;

var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;

begin
  FillChar(Config, SizeOf(Config), 0);
  Config.Model.Vits.Model := 'vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx';
  Config.Model.Vits.Tokens := 'vits-piper-en_US-arctic-medium/tokens.txt';
  Config.Model.Vits.DataDir := 'vits-piper-en_US-arctic-medium/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;

  Tts := TSherpaOnnxOfflineTts.Create(@Config);

  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;

  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);

  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
  WriteLn('Saved to ./test.wav');

  Audio.Free;
  Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-arctic-medium with Go API.
package main

import (
    "fmt"

    sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)

func main() {
    config := sherpa.OfflineTtsConfig{
        Model: sherpa.OfflineTtsModelConfig{
            Vits: sherpa.OfflineTtsVitsModelConfig{
                Model:   "vits-piper-en_US-arctic-medium/en_US-arctic-medium.onnx",
                Tokens:  "vits-piper-en_US-arctic-medium/tokens.txt",
                DataDir: "vits-piper-en_US-arctic-medium/espeak-ng-data",
            },
            NumThreads: 1,
            Debug:      true,
        },
        MaxNumSentences: 1,
    }

    tts := sherpa.NewOfflineTts(&config)
    defer tts.Delete()

    text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

    genConfig := sherpa.GenerationConfig{
        Sid:          0,
        Speed:        1.0,
        SilenceScale: 0.2,
    }

    audio := tts.GenerateWithConfig(text, &genConfig, nil)

    filename := "./test.wav"
    sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
    fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
vits-piper-en_US-bryce-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/bryce/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-bryce-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
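The C example reads three paths relative to the current working directory, so it only works when run from the directory that contains the extracted model. The following sketch checks for those paths before you run the binary. It is plain Python, and missing_model_files is a hypothetical helper name, not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir: str, voice: str) -> list[str]:
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / f"{voice}.onnx",   # ONNX model file
        root / "tokens.txt",      # tokens file
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


# Paths taken from the example above; run this from the directory
# that holds the extracted model (for instance /tmp).
missing = missing_model_files("vits-piper-en_US-bryce-medium", "en_US-bryce-medium")
print("All files present" if not missing else f"Missing: {missing}")
```

An empty list means all three paths used by the example exist.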
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-bryce-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
            data_dir="vits-piper-en_US-bryce-medium/espeak-ng-data",
            tokens="vits-piper-en_US-bryce-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-bryce-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-bryce-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx".into()),
tokens: Some("vits-piper-en_US-bryce-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-bryce-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-bryce-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx',
tokens: 'vits-piper-en_US-bryce-medium/tokens.txt',
dataDir: 'vits-piper-en_US-bryce-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
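The real-time factor (RTF) printed by the example above is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The following sketch repeats the same arithmetic in Python; real_time_factor is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio; generating them
# in 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 44100, 22050))  # 0.25
```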
Dart API
You can use the following code to play with vits-piper-en_US-bryce-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx',
tokens: 'vits-piper-en_US-bryce-medium/tokens.txt',
dataDir: 'vits-piper-en_US-bryce-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-bryce-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-bryce-medium/tokens.txt",
dataDir: "vits-piper-en_US-bryce-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-bryce-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-bryce-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-bryce-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-bryce-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
tokens = "vits-piper-en_US-bryce-medium/tokens.txt",
dataDir = "vits-piper-en_US-bryce-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-bryce-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx");
vits.setTokens("vits-piper-en_US-bryce-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-bryce-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-bryce-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-bryce-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-bryce-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-bryce-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-bryce-medium/en_US-bryce-medium.onnx",
Tokens: "vits-piper-en_US-bryce-medium/tokens.txt",
DataDir: "vits-piper-en_US-bryce-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
audio samples for each speaker are listed below:
Speaker 0
vits-piper-en_US-danny-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/danny/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-danny-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-danny-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-danny-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
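After the program finishes, you may want to confirm that ./test.wav has the sample rate listed for this model (16000). The following sketch reads the file header with Python's standard wave module. It assumes the file is uncompressed PCM, which is the only WAV encoding that module reads; wav_info is a hypothetical helper name, not part of sherpa-onnx.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration
```

For example, calling wav_info("./test.wav") in the directory where the program ran returns the three values as a tuple; the first one should equal the sample rate in the table for this model.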
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-danny-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-danny-low/en_US-danny-low.onnx",
            data_dir="vits-piper-en_US-danny-low/espeak-ng-data",
            tokens="vits-piper-en_US-danny-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
    sid=0,
    speed=1.0,
)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-danny-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-danny-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-danny-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run the following before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-danny-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-danny-low/en_US-danny-low.onnx".into()),
tokens: Some("vits-piper-en_US-danny-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-danny-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-danny-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-danny-low/en_US-danny-low.onnx',
tokens: 'vits-piper-en_US-danny-low/tokens.txt',
dataDir: 'vits-piper-en_US-danny-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-danny-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-danny-low/en_US-danny-low.onnx',
tokens: 'vits-piper-en_US-danny-low/tokens.txt',
dataDir: 'vits-piper-en_US-danny-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-danny-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
lexicon: "",
tokens: "vits-piper-en_US-danny-low/tokens.txt",
dataDir: "vits-piper-en_US-danny-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-danny-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-danny-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-danny-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-danny-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
tokens = "vits-piper-en_US-danny-low/tokens.txt",
dataDir = "vits-piper-en_US-danny-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-danny-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-danny-low/en_US-danny-low.onnx");
vits.setTokens("vits-piper-en_US-danny-low/tokens.txt");
vits.setDataDir("vits-piper-en_US-danny-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-danny-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-danny-low/en_US-danny-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-danny-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-danny-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-danny-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-danny-low/en_US-danny-low.onnx",
Tokens: "vits-piper-en_US-danny-low/tokens.txt",
DataDir: "vits-piper-en_US-danny-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
Sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_US-glados-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/rhasspy/piper/issues/187#issuecomment-1805709037
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-glados-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-glados-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-glados-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-glados-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-glados-high/en_US-glados-high.onnx",
data_dir="vits-piper-en_US-glados-high/espeak-ng-data",
tokens="vits-piper-en_US-glados-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
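Before building the config, it can help to check that the files referenced above actually exist, since a wrong path is a common reason for a config to be rejected. The following is a minimal sketch using only the Python standard library; the helper name `missing_model_files` is hypothetical and not part of sherpa-onnx, and the three expected entries are simply the paths used in the example above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that are absent from disk.

    Hypothetical helper: it only mirrors the three paths passed to
    OfflineTtsVitsModelConfig in the example above.
    """
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the acoustic model (.onnx)
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "vits-piper-en_US-glados-high", "en_US-glados-high.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

If the list is empty, the paths at least exist; it says nothing about whether the files themselves are valid.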
C++ API
You can use the following code to play with vits-piper-en_US-glados-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-glados-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-glados-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-glados-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-glados-high/en_US-glados-high.onnx".into()),
tokens: Some("vits-piper-en_US-glados-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-glados-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-glados-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-glados-high/en_US-glados-high.onnx',
tokens: 'vits-piper-en_US-glados-high/tokens.txt',
dataDir: 'vits-piper-en_US-glados-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
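The example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The arithmetic can be sketched in Python as follows; the function name and the timing numbers are hypothetical and chosen only to illustrate the formula.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = elapsed time / audio duration; values below 1 are faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s to generate 110250 samples at 22050 Hz (5 s of audio).
rtf = real_time_factor(0.5, 110250, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.100"
```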
Dart API
You can use the following code to play with vits-piper-en_US-glados-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-glados-high/en_US-glados-high.onnx',
tokens: 'vits-piper-en_US-glados-high/tokens.txt',
dataDir: 'vits-piper-en_US-glados-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-glados-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-glados-high/tokens.txt",
dataDir: "vits-piper-en_US-glados-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-glados-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-glados-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-glados-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-glados-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
tokens = "vits-piper-en_US-glados-high/tokens.txt",
dataDir = "vits-piper-en_US-glados-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
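The callback contract used above (return 1 to continue, return 0 to stop) can be illustrated without loading a model. The following Python sketch collects generated chunks and asks the generator to stop once a time budget is reached. The class name `StopAfter`, the one-second budget, and the optional `progress` argument are assumptions made for illustration; they are not part of the sherpa-onnx API, and the chunks fed to it below are fake data.

```python
from typing import List, Sequence


class StopAfter:
    """Collect generated chunks and request a stop after a time budget."""

    def __init__(self, sample_rate: int, max_seconds: float):
        self.max_samples = int(sample_rate * max_seconds)
        self.chunks: List[Sequence[float]] = []
        self.num_samples = 0

    def __call__(self, samples: Sequence[float], progress: float = 0.0) -> int:
        # Keep every chunk so the partial audio is still usable after stopping.
        self.chunks.append(samples)
        self.num_samples += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if self.num_samples < self.max_samples else 0


# Simulate three half-second chunks at 22050 Hz with a one-second budget.
cb = StopAfter(sample_rate=22050, max_seconds=1.0)
results = [cb([0.0] * 11025) for _ in range(3)]
print(results)  # prints [1, 0, 0]
```

A generator that honors the return value would stop after the second chunk; the third call is shown only to make clear that the callback keeps returning 0 once the budget is used up.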
Java API
You can use the following code to play with vits-piper-en_US-glados-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-glados-high/en_US-glados-high.onnx");
vits.setTokens("vits-piper-en_US-glados-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-glados-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-glados-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-glados-high/en_US-glados-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-glados-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-glados-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-glados-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-glados-high/en_US-glados-high.onnx",
Tokens: "vits-piper-en_US-glados-high/tokens.txt",
DataDir: "vits-piper-en_US-glados-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
Sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_US-hfc_female-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/hfc_female/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
data_dir="vits-piper-en_US-hfc_female-medium/espeak-ng-data",
tokens="vits-piper-en_US-hfc_female-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
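If you prefer not to depend on soundfile, the generated samples can also be saved with the `wave` module from the Python standard library. The sketch below is one possible way to do it; the helper name `write_wave_pcm16` is hypothetical, and it assumes the samples are mono floating-point values in the range [-1, 1], which it converts to 16-bit PCM.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float],
                     sample_rate: int) -> None:
    """Write mono float samples as a 16-bit PCM wave file.

    Assumption: samples are floats nominally in [-1, 1]; values outside
    that range are clipped.
    """
    # Clip each sample to [-1, 1], then scale to the int16 range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)  # e.g. 22050 for this model
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

With the objects from the example above, a call could look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.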
C++ API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx".into()),
tokens: Some("vits-piper-en_US-hfc_female-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-hfc_female-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-hfc_female-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx',
tokens: 'vits-piper-en_US-hfc_female-medium/tokens.txt',
dataDir: 'vits-piper-en_US-hfc_female-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
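The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python as follows. The function name `real_time_factor` and the numbers in the usage lines are illustrative only and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds.
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


if __name__ == "__main__":
    # Hypothetical measurement: 0.5 s of compute for 44100 samples at 22050 Hz.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```

With the Python API you would pass `len(audio.samples)` and `audio.sample_rate`, together with a wall-clock time you measured around the call to `generate`.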
Dart API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx',
tokens: 'vits-piper-en_US-hfc_female-medium/tokens.txt',
dataDir: 'vits-piper-en_US-hfc_female-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-hfc_female-medium/tokens.txt",
dataDir: "vits-piper-en_US-hfc_female-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
tokens = "vits-piper-en_US-hfc_female-medium/tokens.txt",
dataDir = "vits-piper-en_US-hfc_female-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx");
vits.setTokens("vits-piper-en_US-hfc_female-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-hfc_female-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-hfc_female-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-hfc_female-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-hfc_female-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-hfc_female-medium/en_US-hfc_female-medium.onnx",
Tokens: "vits-piper-en_US-hfc_female-medium/tokens.txt",
DataDir: "vits-piper-en_US-hfc_female-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-hfc_male-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/hfc_male/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
data_dir="vits-piper-en_US-hfc_male-medium/espeak-ng-data",
tokens="vits-piper-en_US-hfc_male-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
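After running any of the examples on this page, it can be useful to confirm that the saved file matches the model's advertised sample rate (22050 Hz for this model). The following is a minimal sketch using only the Python standard library. The helper name `wav_info` is hypothetical, and the sketch assumes the file is an uncompressed PCM WAV file, which is the only kind the `wave` module is guaranteed to read.

```python
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        # One frame holds one sample per channel.
        duration_seconds = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration_seconds
```

For example, `wav_info("./test.wav")` on the file written by the C example should report a sample rate of 22050 if the model and the writer behave as described above.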
C++ API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
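The ProgressCallback in the C++ example above returns 1 to continue generating and 0 to stop. To illustrate how that return value can be used to end generation early, here is a small Python sketch of a callable with the same contract. The class name `StopAfterSeconds` is hypothetical, and two things are assumptions rather than documented behaviour: that each call receives only the newly generated chunk of samples, and that a given language binding accepts a callable of this shape.

```python
class StopAfterSeconds:
    """Callback that asks generation to stop once enough audio has arrived."""

    def __init__(self, max_seconds: float, sample_rate: int):
        # Convert the time budget into a sample budget.
        self.max_samples = int(max_seconds * sample_rate)
        self.received = 0

    def __call__(self, samples, progress: float) -> int:
        # Assumption: `samples` is the new chunk, not the whole audio so far.
        self.received += len(samples)
        # 1 means continue, 0 means stop (same convention as ProgressCallback).
        return 1 if self.received < self.max_samples else 0
```

Keeping the counter on the object rather than in a global makes the callback reusable across several generation calls: create a fresh instance for each call.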
Rust API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx".into()),
tokens: Some("vits-piper-en_US-hfc_male-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-hfc_male-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-hfc_male-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx',
tokens: 'vits-piper-en_US-hfc_male-medium/tokens.txt',
dataDir: 'vits-piper-en_US-hfc_male-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx',
tokens: 'vits-piper-en_US-hfc_male-medium/tokens.txt',
dataDir: 'vits-piper-en_US-hfc_male-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-hfc_male-medium/tokens.txt",
dataDir: "vits-piper-en_US-hfc_male-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
tokens = "vits-piper-en_US-hfc_male-medium/tokens.txt",
dataDir = "vits-piper-en_US-hfc_male-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx");
vits.setTokens("vits-piper-en_US-hfc_male-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-hfc_male-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-hfc_male-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-hfc_male-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-hfc_male-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-hfc_male-medium/en_US-hfc_male-medium.onnx",
Tokens: "vits-piper-en_US-hfc_male-medium/tokens.txt",
DataDir: "vits-piper-en_US-hfc_male-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-joe-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/joe/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-joe-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-joe-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-joe-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and libraries inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-joe-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
data_dir="vits-piper-en_US-joe-medium/espeak-ng-data",
tokens="vits-piper-en_US-joe-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
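The Python example above relies on the third-party soundfile package to save the audio. If you would rather avoid that dependency, the following sketch writes mono 16-bit PCM using only the standard library. The helper name `write_wav_pcm16` is hypothetical, and the sketch assumes the generated samples are floats in the range [-1, 1]; values outside that range are clipped.

```python
import struct
import wave


def write_wav_pcm16(path: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

Under those assumptions, a call such as `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)` would take the place of the `sf.write` line.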
C++ API
You can use the following code to play with vits-piper-en_US-joe-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-joe-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-joe-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-joe-medium/en_US-joe-medium.onnx".into()),
tokens: Some("vits-piper-en_US-joe-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-joe-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
Click to expand
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-joe-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx',
tokens: 'vits-piper-en_US-joe-medium/tokens.txt',
dataDir: 'vits-piper-en_US-joe-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx',
tokens: 'vits-piper-en_US-joe-medium/tokens.txt',
dataDir: 'vits-piper-en_US-joe-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-joe-medium/tokens.txt",
dataDir: "vits-piper-en_US-joe-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-joe-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-joe-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
tokens = "vits-piper-en_US-joe-medium/tokens.txt",
dataDir = "vits-piper-en_US-joe-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
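The callbacks in the examples above return 1 to continue generating and 0 to stop. As a minimal sketch of how that return value can be used, the Python closure below requests a stop once a sample budget is reached. `make_limit_callback` is a hypothetical name, and the `(samples, progress)` signature is an assumption modeled on the C++ callback on this page; check the documentation of the API you use before relying on it.

```python
def make_limit_callback(max_samples):
    """Build a callback that asks the generator to stop after max_samples.

    Hypothetical helper; the (samples, progress) signature is an assumption.
    """
    seen = 0  # number of samples delivered to the callback so far

    def callback(samples, progress):
        nonlocal seen
        seen += len(samples)
        # 1 means continue, 0 means stop
        return 1 if seen < max_samples else 0

    return callback
```

For a model with a sample rate of 16000 Hz, `make_limit_callback(16000 * 5)` would request a stop once about five seconds of audio have been delivered to the callback.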
Java API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-joe-medium/en_US-joe-medium.onnx");
vits.setTokens("vits-piper-en_US-joe-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-joe-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-joe-medium/en_US-joe-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-joe-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-joe-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
Click to expand
You can use the following code to play with vits-piper-en_US-joe-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-joe-medium/en_US-joe-medium.onnx",
Tokens: "vits-piper-en_US-joe-medium/tokens.txt",
DataDir: "vits-piper-en_US-joe-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_US-john-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/john/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Click to expand
Model download address
Android APK
Click to expand
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-john-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-john-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Click to expand
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-john-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-john-medium/en_US-john-medium.onnx",
data_dir="vits-piper-en_US-john-medium/espeak-ng-data",
tokens="vits-piper-en_US-john-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
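The Piper examples in this section all use the same three paths: the `.onnx` model, `tokens.txt`, and the `espeak-ng-data` directory. A minimal sketch, assuming the layout seen in those examples (a directory named `vits-piper-<voice>` that contains `<voice>.onnx`), derives the paths and checks that they exist before the config is built. `piper_model_paths` and `missing_paths` are hypothetical helper names, not part of sherpa-onnx.

```python
import os


def piper_model_paths(model_dir):
    """Derive the model, tokens, and espeak-ng-data paths from model_dir.

    Assumes model_dir is named vits-piper-<voice> and holds <voice>.onnx.
    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    model_dir = model_dir.rstrip("/")
    name = os.path.basename(model_dir)
    prefix = "vits-piper-"
    if not name.startswith(prefix):
        raise ValueError(f"Unexpected directory name: {name}")
    voice = name[len(prefix):]
    return {
        "model": f"{model_dir}/{voice}.onnx",
        "tokens": f"{model_dir}/tokens.txt",
        "data_dir": f"{model_dir}/espeak-ng-data",
    }


def missing_paths(paths):
    """Return the entries of paths that do not exist on disk."""
    return [p for p in paths.values() if not os.path.exists(p)]
```

The returned keys match the keyword arguments passed to `OfflineTtsVitsModelConfig` in the code above, so the dictionary can be unpacked with `**` once `missing_paths` returns an empty list.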
C++ API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-john-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-john-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-john-medium/en_US-john-medium.onnx".into()),
tokens: Some("vits-piper-en_US-john-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-john-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
Click to expand
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-john-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-john-medium/en_US-john-medium.onnx',
tokens: 'vits-piper-en_US-john-medium/tokens.txt',
dataDir: 'vits-piper-en_US-john-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-john-medium/en_US-john-medium.onnx',
tokens: 'vits-piper-en_US-john-medium/tokens.txt',
dataDir: 'vits-piper-en_US-john-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-john-medium/tokens.txt",
dataDir: "vits-piper-en_US-john-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-john-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-john-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
tokens = "vits-piper-en_US-john-medium/tokens.txt",
dataDir = "vits-piper-en_US-john-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-john-medium/en_US-john-medium.onnx");
vits.setTokens("vits-piper-en_US-john-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-john-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-john-medium/en_US-john-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-john-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-john-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
Click to expand
You can use the following code to play with vits-piper-en_US-john-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-john-medium/en_US-john-medium.onnx",
Tokens: "vits-piper-en_US-john-medium/tokens.txt",
DataDir: "vits-piper-en_US-john-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_US-kathleen-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kathleen/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Click to expand
Model download address
Android APK
Click to expand
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-kathleen-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
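const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
// Stop early if the engine could not be created, e.g., because of a wrong path
if (!tts) {
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return -1;
}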
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
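If you launch the compiled binary from a script rather than from a shell, the same library path can be set for the child process only. The following is a minimal sketch; the helper name env_with_library_path is hypothetical, and the choice between the two variables follows the Linux and macOS cases described above.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    system = system or platform.system()
    # macOS reports "Darwin"; everything else is treated like Linux here
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    env = dict(os.environ if base_env is None else base_env)
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


# Usage (not executed here):
#   import subprocess
#   env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```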
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-kathleen-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
data_dir="vits-piper-en_US-kathleen-low/espeak-ng-data",
tokens="vits-piper-en_US-kathleen-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-kathleen-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-kathleen-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx".into()),
tokens: Some("vits-piper-en_US-kathleen-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-kathleen-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
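The C++ and Rust examples above pass a callback that receives each chunk of generated samples together with a progress value, and whose return value tells the generator whether to continue. The sketch below illustrates that pattern in plain Python without calling sherpa-onnx. The class name ChunkCollector and the sample budget are hypothetical, and the loop is only a stand-in for a real generator.

```python
class ChunkCollector:
    """Collect audio chunks and ask the generator to stop once a sample budget is reached."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []
        self.num_samples = 0

    def __call__(self, samples, progress):
        self.chunks.append(list(samples))
        self.num_samples += len(samples)
        # 1 means continue generating, 0 means stop
        # (the same convention as ProgressCallback in the C++ example)
        return 1 if self.num_samples < self.max_samples else 0


collector = ChunkCollector(max_samples=5)
fake_chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6]]

# Stand-in for a generator that delivers audio chunk by chunk
for i, chunk in enumerate(fake_chunks):
    if collector(chunk, progress=(i + 1) / len(fake_chunks)) == 0:
        break

print(collector.num_samples, len(collector.chunks))
```

Keeping the state in an object rather than in globals makes it easy to use one collector per request.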
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-kathleen-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx',
tokens: 'vits-piper-en_US-kathleen-low/tokens.txt',
dataDir: 'vits-piper-en_US-kathleen-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
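The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The helper names real_time_factor and timed are hypothetical and are not part of any sherpa-onnx API.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

An RTF below 1 means the audio was generated faster than it takes to play back.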
Dart API
You can use the following code to play with vits-piper-en_US-kathleen-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx',
tokens: 'vits-piper-en_US-kathleen-low/tokens.txt',
dataDir: 'vits-piper-en_US-kathleen-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-kathleen-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
lexicon: "",
tokens: "vits-piper-en_US-kathleen-low/tokens.txt",
dataDir: "vits-piper-en_US-kathleen-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-kathleen-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kathleen-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kathleen-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-kathleen-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
tokens = "vits-piper-en_US-kathleen-low/tokens.txt",
dataDir = "vits-piper-en_US-kathleen-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-kathleen-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx");
vits.setTokens("vits-piper-en_US-kathleen-low/tokens.txt");
vits.setDataDir("vits-piper-en_US-kathleen-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-kathleen-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-kathleen-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-kathleen-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-kathleen-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-kathleen-low/en_US-kathleen-low.onnx",
Tokens: "vits-piper-en_US-kathleen-low/tokens.txt",
DataDir: "vits-piper-en_US-kathleen-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-en_US-kristin-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kristin/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-kristin-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
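const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
// Stop early if the engine could not be created, e.g., because of a wrong path
if (!tts) {
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return -1;
}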
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-kristin-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
data_dir="vits-piper-en_US-kristin-medium/espeak-ng-data",
tokens="vits-piper-en_US-kristin-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-kristin-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-kristin-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx".into()),
tokens: Some("vits-piper-en_US-kristin-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-kristin-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-kristin-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx',
tokens: 'vits-piper-en_US-kristin-medium/tokens.txt',
dataDir: 'vits-piper-en_US-kristin-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-kristin-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx',
tokens: 'vits-piper-en_US-kristin-medium/tokens.txt',
dataDir: 'vits-piper-en_US-kristin-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-kristin-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-kristin-medium/tokens.txt",
dataDir: "vits-piper-en_US-kristin-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-kristin-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kristin-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kristin-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-kristin-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
tokens = "vits-piper-en_US-kristin-medium/tokens.txt",
dataDir = "vits-piper-en_US-kristin-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
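The callback in the example above returns 1 to continue generating and 0 to stop. The Python sketch below illustrates only that convention, using a stand-in driver loop. The names `make_budget_callback` and `fake_generate` are hypothetical and are not part of sherpa-onnx; a real engine decides for itself how large each chunk of samples is.

```python
from typing import Callable, List


def make_budget_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Return a callback that asks the driver to stop after max_samples samples."""
    received = 0

    def callback(samples: List[float]) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue, 0 means stop (same convention as the example above).
        return 1 if received < max_samples else 0

    return callback


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine: feeds chunks until the callback returns 0."""
    output: List[float] = []
    for chunk in chunks:
        output.extend(chunk)
        if callback(chunk) == 0:
            break
    return output


chunks = [[0.0] * 100, [0.1] * 100, [0.2] * 100]
audio = fake_generate(chunks, make_budget_callback(max_samples=150))
print(len(audio))  # prints 200: the second chunk crosses the budget, the third is never produced
```

A callback like this is one way to cancel a long synthesis early, for example when the user closes the player.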
Java API
You can use the following code to play with vits-piper-en_US-kristin-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx");
vits.setTokens("vits-piper-en_US-kristin-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-kristin-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-kristin-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-kristin-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-kristin-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-kristin-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-kristin-medium/en_US-kristin-medium.onnx",
Tokens: "vits-piper-en_US-kristin-medium/tokens.txt",
DataDir: "vits-piper-en_US-kristin-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-kusal-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/kusal/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
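The API examples for this model all refer to the same three paths inside the extracted model directory: the `.onnx` file, `tokens.txt`, and `espeak-ng-data`. As a minimal sketch, the following Python helper derives those paths from the directory name and reports which ones are missing, which can save a confusing runtime error. The helper names `piper_model_paths` and `missing_paths` are hypothetical, and the naming pattern is inferred from the examples in this section rather than guaranteed for every model.

```python
from pathlib import Path
from typing import Dict, List

PREFIX = "vits-piper-"


def piper_model_paths(model_dir: str) -> Dict[str, Path]:
    """Derive the three paths used by the examples from the model directory name."""
    root = Path(model_dir)
    name = root.name
    # "vits-piper-en_US-kusal-medium" -> "en_US-kusal-medium"
    voice = name[len(PREFIX):] if name.startswith(PREFIX) else name
    return {
        "model": root / f"{voice}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_paths(paths: Dict[str, Path]) -> List[str]:
    """Return the keys whose path does not exist on disk."""
    return [key for key, path in paths.items() if not path.exists()]


paths = piper_model_paths("vits-piper-en_US-kusal-medium")
print(paths["model"].as_posix())
print("Missing:", missing_paths(paths))
```

An empty "Missing" list means all three paths exist relative to the current working directory.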
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-kusal-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command (the libraries come after the source file so that linkers which resolve symbols in command-line order can find them):
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
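If you launch the compiled binary from a script rather than an interactive shell, the library path can be set for that one process only. The sketch below is a minimal Python illustration; `library_path_env` is a hypothetical helper, and it treats every non-macOS platform like Linux, which is an assumption.

```python
import os
import sys
from typing import Dict, Optional


def library_path_env(lib_dir: str, platform: str = sys.platform,
                     base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; this sketch uses LD_LIBRARY_PATH everywhere else.
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(name, "")
    env[name] = f"{lib_dir}:{current}" if current else lib_dir
    return env


env = library_path_env("/tmp/sherpa-onnx/shared/lib")
# To run the compiled example with this environment, something like the
# following could be used (requires `import subprocess`):
# subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```

Building a fresh environment dictionary leaves the calling shell untouched, so other programs are not affected by the extra library directory.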
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-kusal-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
            data_dir="vits-piper-en_US-kusal-medium/espeak-ng-data",
            tokens="vits-piper-en_US-kusal-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
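If you would rather not depend on soundfile, the generated samples can also be written with Python's standard `wave` module. The sketch below assumes the samples are mono floats in the range [-1, 1]; `write_wav_16bit` is a hypothetical helper name, and the short list passed to it is stand-in data rather than real TTS output.

```python
import struct
import wave
from typing import Sequence


def write_wav_16bit(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the conversion cannot overflow
        pcm += struct.pack("<h", int(s * 32767))  # little-endian signed 16-bit
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Stand-in data; with the example above you would pass
# audio.samples and audio.sample_rate instead.
write_wav_16bit("sketch.wav", [0.0, 0.5, -0.5, 1.0], 22050)
```

The 22050 used here matches the sample rate listed for this model; in real code it is safer to take the value from the generated audio object.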
C++ API
You can use the following code to play with vits-piper-en_US-kusal-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command (the libraries come after the source file so that linkers which resolve symbols in command-line order can find them):
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-kusal-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx".into()),
tokens: Some("vits-piper-en_US-kusal-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-kusal-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-kusal-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx',
tokens: 'vits-piper-en_US-kusal-medium/tokens.txt',
dataDir: 'vits-piper-en_US-kusal-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
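The Node.js example above reports the real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A value below 1 means synthesis runs faster than playback. The same arithmetic as a small Python sketch, with a hypothetical function name and made-up timing values:

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration, with duration = samples / sample rate."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz are 2 seconds of audio.
# Generating them in 0.5 seconds gives an RTF of 0.25.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```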
Dart API
You can use the following code to play with vits-piper-en_US-kusal-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx',
tokens: 'vits-piper-en_US-kusal-medium/tokens.txt',
dataDir: 'vits-piper-en_US-kusal-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-kusal-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-kusal-medium/tokens.txt",
dataDir: "vits-piper-en_US-kusal-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-kusal-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-kusal-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-kusal-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-kusal-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
tokens = "vits-piper-en_US-kusal-medium/tokens.txt",
dataDir = "vits-piper-en_US-kusal-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-kusal-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx");
vits.setTokens("vits-piper-en_US-kusal-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-kusal-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-kusal-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-kusal-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-kusal-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-kusal-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-kusal-medium/en_US-kusal-medium.onnx",
Tokens: "vits-piper-en_US-kusal-medium/tokens.txt",
DataDir: "vits-piper-en_US-kusal-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-l2arctic-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/l2arctic/medium
| Number of speakers | Sample rate |
|---|---|
| 24 | 22050 |
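This model has 24 speakers, and the API examples for it select one through the speaker ID (`sid`). The Python sketch below assumes that speaker IDs are zero-based, so valid values run from 0 to 23; this matches the `sid` of 0 used in the examples but is otherwise an assumption. The helper names `check_sid` and `output_names` are hypothetical.

```python
from typing import List

NUM_SPEAKERS = 24  # taken from the table above


def check_sid(sid: int, num_speakers: int = NUM_SPEAKERS) -> int:
    """Return sid unchanged if it is a valid zero-based speaker ID, else raise."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_names(prefix: str, num_speakers: int = NUM_SPEAKERS) -> List[str]:
    """One output filename per speaker, e.g. for generating a sample of each voice."""
    return [f"{prefix}-sid-{sid}.wav" for sid in range(num_speakers)]


names = output_names("l2arctic")
print(len(names), names[0], names[-1])
# prints 24 l2arctic-sid-0.wav l2arctic-sid-23.wav
```

Looping over these names while passing the matching `sid` to the generation call is one way to compare all voices on the same text.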
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command (the libraries come after the source file so that linkers which resolve symbols in command-line order can find them):
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-l2arctic-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
            data_dir="vits-piper-en_US-l2arctic-medium/espeak-ng-data",
            tokens="vits-piper-en_US-l2arctic-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command (the libraries come after the source file so that linkers which resolve symbols in command-line order can find them):
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx".into()),
tokens: Some("vits-piper-en_US-l2arctic-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-l2arctic-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-l2arctic-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx',
tokens: 'vits-piper-en_US-l2arctic-medium/tokens.txt',
dataDir: 'vits-piper-en_US-l2arctic-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
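The real-time factor (RTF) printed by the example above is the elapsed wall-clock time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A minimal Python sketch of the same arithmetic follows; the function name and the numbers are illustrative and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz (2 s of audio)
# generated in 0.5 s of wall-clock time.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means that synthesis ran faster than the audio takes to play back.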
Dart API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx',
tokens: 'vits-piper-en_US-l2arctic-medium/tokens.txt',
dataDir: 'vits-piper-en_US-l2arctic-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-l2arctic-medium/tokens.txt",
dataDir: "vits-piper-en_US-l2arctic-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
tokens = "vits-piper-en_US-l2arctic-medium/tokens.txt",
dataDir = "vits-piper-en_US-l2arctic-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
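The callback convention used above (return 1 to continue, 0 to stop) can be illustrated without a model. The following Python sketch is only a stand-in: `StopAfter` is a hypothetical helper, and the loop that feeds it chunks imitates an engine delivering audio piece by piece, which is an assumption rather than documented behaviour of any particular binding.

```python
from typing import List, Sequence


class StopAfter:
    """Callback that collects samples and asks generation to stop once
    at least max_samples have been received."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received: List[float] = []

    def __call__(self, samples: Sequence[float]) -> int:
        self.received.extend(samples)
        # 1 means continue, 0 means stop
        return 1 if len(self.received) < self.max_samples else 0


cb = StopAfter(max_samples=5)
chunks = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7]]
for chunk in chunks:  # stand-in for an engine delivering audio in chunks
    if cb(chunk) == 0:
        break
print(len(cb.received))
```

The callback only sees whole chunks, so it may receive slightly more than `max_samples` before it gets the chance to return 0.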
Java API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx");
vits.setTokens("vits-piper-en_US-l2arctic-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-l2arctic-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-l2arctic-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-l2arctic-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-l2arctic-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-l2arctic-medium/en_US-l2arctic-medium.onnx",
Tokens: "vits-piper-en_US-l2arctic-medium/tokens.txt",
DataDir: "vits-piper-en_US-l2arctic-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
Speaker 18
Speaker 19
Speaker 20
Speaker 21
Speaker 22
Speaker 23
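The list above has one entry per speaker ID, from 0 to 23. A hedged sketch of how such a set could be produced is to loop over the speaker IDs and generate once per ID. Here `synthesize` is a hypothetical stand-in for whatever call produces audio for a given `sid` (for example, a thin wrapper around the Python `tts.generate(text=..., sid=..., speed=...)` call used elsewhere in this document), and the file naming scheme is an assumption.

```python
from typing import Callable, Dict


def generate_for_all_speakers(
    synthesize: Callable[[int], object],
    num_speakers: int,
    prefix: str = "speaker",
) -> Dict[str, object]:
    """Call synthesize(sid) for every sid in [0, num_speakers).

    Returns a mapping from a per-speaker file name to the value returned
    by synthesize; saving to disk is left to the caller.
    """
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    return {f"{prefix}-{sid}.wav": synthesize(sid) for sid in range(num_speakers)}


# Stand-in synthesizer so that the sketch runs without a model.
outputs = generate_for_all_speakers(lambda sid: f"audio for sid {sid}", 24)
print(len(outputs), "entries, last one:", list(outputs)[-1])
```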
vits-piper-en_US-lessac-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
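The examples in this section expect three items inside the extracted model directory: the `.onnx` model file, `tokens.txt`, and the `espeak-ng-data` directory. The following sketch checks for them before a TTS object is created. `missing_model_files` is a hypothetical helper, not a sherpa-onnx function, and the demo runs against a temporary directory instead of a real download.

```python
import os
import tempfile
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths under model_dir that do not exist."""
    expected = [
        os.path.join(model_dir, onnx_name),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    return [p for p in expected if not os.path.exists(p)]


# Demo against a temporary directory that contains only tokens.txt.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "tokens.txt"), "w").close()
    missing = missing_model_files(d, "en_US-lessac-high.onnx")
    print([os.path.basename(p) for p in missing])
```

With a real download, you would call it as `missing_model_files("vits-piper-en_US-lessac-high", "en_US-lessac-high.onnx")` and expect an empty list.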
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-lessac-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
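If you launch the compiled binary from a script instead of an interactive shell, the same library search path can be set for the child process only. A minimal sketch, assuming the install prefix used above; `env_with_library_path` is a hypothetical helper name.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def env_with_library_path(
    lib_dir: str,
    platform: Optional[str] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path (DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise)."""
    platform = sys.platform if platform is None else platform
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env


env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
# To run the example with this environment:
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True)
```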
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-lessac-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
data_dir="vits-piper-en_US-lessac-high/espeak-ng-data",
tokens="vits-piper-en_US-lessac-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
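If soundfile is not installed, the standard-library `wave` module can write 16-bit PCM instead. The sketch below assumes the samples are floats nominally in the range [-1, 1] and clips anything outside that range; `write_wav_int16` is a hypothetical helper, not part of sherpa-onnx, and the demo uses made-up samples instead of real model output.

```python
import os
import struct
import tempfile
import wave
from typing import Sequence


def write_wav_int16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples as 16-bit PCM, clipping to [-1, 1]."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Demo with made-up samples; 22050 matches the sample rate listed for this model.
path = os.path.join(tempfile.mkdtemp(), "test.wav")
write_wav_int16(path, [0.0, 0.5, -0.5, 2.0], 22050)
with wave.open(path, "rb") as f:
    print(f.getframerate(), f.getnframes())
```

With the example above, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.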
C++ API
You can use the following code to play with vits-piper-en_US-lessac-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-lessac-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-lessac-high/en_US-lessac-high.onnx".into()),
tokens: Some("vits-piper-en_US-lessac-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-lessac-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-lessac-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx',
tokens: 'vits-piper-en_US-lessac-high/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-lessac-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx',
tokens: 'vits-piper-en_US-lessac-high/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-lessac-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-lessac-high/tokens.txt",
dataDir: "vits-piper-en_US-lessac-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-lessac-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-lessac-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
tokens = "vits-piper-en_US-lessac-high/tokens.txt",
dataDir = "vits-piper-en_US-lessac-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-lessac-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-lessac-high/en_US-lessac-high.onnx");
vits.setTokens("vits-piper-en_US-lessac-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-lessac-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-lessac-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-lessac-high/en_US-lessac-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-lessac-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-lessac-high/en_US-lessac-high.onnx",
Tokens: "vits-piper-en_US-lessac-high/tokens.txt",
DataDir: "vits-piper-en_US-lessac-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-lessac-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-lessac-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-lessac-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
data_dir="vits-piper-en_US-lessac-low/espeak-ng-data",
tokens="vits-piper-en_US-lessac-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-lessac-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-lessac-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-lessac-low/en_US-lessac-low.onnx".into()),
tokens: Some("vits-piper-en_US-lessac-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-lessac-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-lessac-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx',
tokens: 'vits-piper-en_US-lessac-low/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
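The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio, so values below 1 mean that synthesis runs faster than playback. A minimal sketch of the same arithmetic follows; `real_time_factor` is a hypothetical helper and the numbers are illustrative, not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    if sample_rate <= 0 or num_samples <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 s of audio; generating it in 0.5 s gives 0.25.
print(real_time_factor(0.5, 44100, 22050))
```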
Dart API
You can use the following code to play with vits-piper-en_US-lessac-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx',
tokens: 'vits-piper-en_US-lessac-low/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-lessac-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
lexicon: "",
tokens: "vits-piper-en_US-lessac-low/tokens.txt",
dataDir: "vits-piper-en_US-lessac-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-lessac-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-lessac-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
tokens = "vits-piper-en_US-lessac-low/tokens.txt",
dataDir = "vits-piper-en_US-lessac-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-lessac-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-lessac-low/en_US-lessac-low.onnx");
vits.setTokens("vits-piper-en_US-lessac-low/tokens.txt");
vits.setDataDir("vits-piper-en_US-lessac-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-lessac-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-lessac-low/en_US-lessac-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-lessac-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-lessac-low/en_US-lessac-low.onnx",
Tokens: "vits-piper-en_US-lessac-low/tokens.txt",
DataDir: "vits-piper-en_US-lessac-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-lessac-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-lessac-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-lessac-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
            data_dir="vits-piper-en_US-lessac-medium/espeak-ng-data",
            tokens="vits-piper-en_US-lessac-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-lessac-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-lessac-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx".into()),
tokens: Some("vits-piper-en_US-lessac-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-lessac-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-lessac-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx',
tokens: 'vits-piper-en_US-lessac-medium/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-lessac-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx',
tokens: 'vits-piper-en_US-lessac-medium/tokens.txt',
dataDir: 'vits-piper-en_US-lessac-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-lessac-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-lessac-medium/tokens.txt",
dataDir: "vits-piper-en_US-lessac-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-lessac-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-lessac-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-lessac-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-lessac-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
tokens = "vits-piper-en_US-lessac-medium/tokens.txt",
dataDir = "vits-piper-en_US-lessac-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-lessac-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx");
vits.setTokens("vits-piper-en_US-lessac-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-lessac-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-lessac-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-lessac-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-lessac-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-lessac-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx",
Tokens: "vits-piper-en_US-lessac-medium/tokens.txt",
DataDir: "vits-piper-en_US-lessac-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-libritts-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts/high
| Number of speakers | Sample rate |
|---|---|
| 904 | 22050 |
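Because this model has 904 speakers, the `sid` value passed to the generation call selects the voice. The sketch below shows one way to validate a speaker ID before generating audio. It assumes speaker IDs are zero-based, so the valid range would be 0 to 903; `check_sid` is a hypothetical helper, not part of the sherpa-onnx API.

```python
def check_sid(sid, num_speakers):
    """Return sid if it is a valid zero-based speaker ID, else raise."""
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be in [0, {num_speakers - 1}], got {sid}"
        )
    return sid


NUM_SPEAKERS = 904  # taken from the table above

check_sid(0, NUM_SPEAKERS)    # first speaker
check_sid(903, NUM_SPEAKERS)  # last speaker under the zero-based assumption
```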
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-libritts-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-libritts-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-libritts-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
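If you launch the compiled binary from a script, you can set the same variable there. The following is a minimal sketch, assuming only the two platforms mentioned above. The helper name library_path_env is hypothetical.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, environ=None):
    """Return (variable_name, new_value) for the dynamic loader search path.

    Hypothetical helper: picks DYLD_LIBRARY_PATH on macOS and
    LD_LIBRARY_PATH otherwise, and prepends lib_dir to any existing value.
    """
    if environ is None:
        environ = os.environ
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = environ.get(name, "")
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value
```

You could then pass the result to a child process, for example by building env = dict(os.environ, **{name: value}) and calling subprocess.run(["/tmp/test-piper"], env=env).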
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-libritts-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
data_dir="vits-piper-en_US-libritts-high/espeak-ng-data",
tokens="vits-piper-en_US-libritts-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
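If you prefer not to depend on soundfile, the samples can also be written with Python's standard wave module. The sketch below assumes that the generated samples are mono floats in the range [-1, 1]. That is an assumption about the audio object, not something stated on this page. The helper name write_wave_pcm16 is hypothetical.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumes samples are floats nominally in [-1, 1]; values outside
    that range are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the audio object from the example above, the call would be write_wave_pcm16("test.wav", audio.samples, audio.sample_rate).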
C++ API
You can use the following code to play with vits-piper-en_US-libritts-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-libritts-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-libritts-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-libritts-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-libritts-high/en_US-libritts-high.onnx".into()),
tokens: Some("vits-piper-en_US-libritts-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-libritts-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-libritts-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx',
tokens: 'vits-piper-en_US-libritts-high/tokens.txt',
dataDir: 'vits-piper-en_US-libritts-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
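The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. An RTF below 1 means generation is faster than playback. The same arithmetic as a minimal Python sketch, with a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Example: 44100 samples at 22050 Hz is 2 seconds of audio.
# If generation took 0.5 seconds, the RTF is 0.5 / 2 = 0.25.
rtf = real_time_factor(0.5, 44100, 22050)
```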
Dart API
You can use the following code to play with vits-piper-en_US-libritts-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx',
tokens: 'vits-piper-en_US-libritts-high/tokens.txt',
dataDir: 'vits-piper-en_US-libritts-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-libritts-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-libritts-high/tokens.txt",
dataDir: "vits-piper-en_US-libritts-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-libritts-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-libritts-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-libritts-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-libritts-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
tokens = "vits-piper-en_US-libritts-high/tokens.txt",
dataDir = "vits-piper-en_US-libritts-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-libritts-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-libritts-high/en_US-libritts-high.onnx");
vits.setTokens("vits-piper-en_US-libritts-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-libritts-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-libritts-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-libritts-high/en_US-libritts-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-libritts-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-libritts-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-libritts-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-libritts-high/en_US-libritts-high.onnx",
Tokens: "vits-piper-en_US-libritts-high/tokens.txt",
DataDir: "vits-piper-en_US-libritts-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0 through Speaker 903 (one audio sample per speaker, 904 in total)
vits-piper-en_US-libritts_r-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts_r/medium
| Number of speakers | Sample rate |
|---|---|
| 904 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-libritts_r-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
data_dir="vits-piper-en_US-libritts_r-medium/espeak-ng-data",
tokens="vits-piper-en_US-libritts_r-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
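If soundfile is not available, the generated samples can be saved with Python's standard-library `wave` module instead. The sketch below assumes the samples are mono floats in the range [-1, 1]; `write_wave_16bit` is a hypothetical helper name, not a sherpa-onnx function.

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 range is not exceeded
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, it would be called as `write_wave_16bit("test.wav", audio.samples, audio.sample_rate)`.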
C++ API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx".into()),
tokens: Some("vits-piper-en_US-libritts_r-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-libritts_r-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
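The C++ and Rust examples pass a callback that receives each generated chunk together with a progress value and tells the generator whether to continue. The same pattern can be sketched in Python. This is an illustration under assumptions: `make_progress_callback` and `max_samples` are hypothetical names, and the sketch assumes the Python `generate` method accepts a `callback` that takes `(samples, progress)` and returns 1 to continue or 0 to stop, mirroring the C++ callback above.

```python
def make_progress_callback(max_samples):
    """Build a callback that collects audio chunks and asks the generator
    to stop once at least max_samples samples have been received."""
    chunks = []
    total = 0

    def callback(samples, progress):
        nonlocal total
        chunks.append(list(samples))
        total += len(samples)
        print(f"Progress: {progress * 100:.1f}%")
        # 1 asks the generator to continue, 0 asks it to stop
        return 1 if total < max_samples else 0

    return callback, chunks
```

Returning the chunk list alongside the callback lets the caller inspect or play the partial audio even when generation is stopped early.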
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-libritts_r-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx',
tokens: 'vits-piper-en_US-libritts_r-medium/tokens.txt',
dataDir: 'vits-piper-en_US-libritts_r-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
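The Node.js example reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic in Python, as a small sketch with a hypothetical helper name:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (duration_seconds, rtf) for a generated waveform."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return duration, elapsed_seconds / duration
```

For instance, 44100 samples at 22050 Hz last 2 seconds; if generation took 0.5 seconds, the RTF is 0.25, meaning the audio was produced four times faster than it plays back.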
Dart API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx',
tokens: 'vits-piper-en_US-libritts_r-medium/tokens.txt',
dataDir: 'vits-piper-en_US-libritts_r-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-libritts_r-medium/tokens.txt",
dataDir: "vits-piper-en_US-libritts_r-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
tokens = "vits-piper-en_US-libritts_r-medium/tokens.txt",
dataDir = "vits-piper-en_US-libritts_r-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx");
vits.setTokens("vits-piper-en_US-libritts_r-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-libritts_r-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-libritts_r-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-libritts_r-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-libritts_r-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx",
Tokens: "vits-piper-en_US-libritts_r-medium/tokens.txt",
DataDir: "vits-piper-en_US-libritts_r-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audio is provided for each of the 904 speakers, Speaker 0 through Speaker 903.
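With 904 speaker IDs (0 through 903), listening to every voice is slow, so it can help to audition a handful spread across the range first. The following Python sketch picks evenly spaced IDs; `pick_speaker_ids` is a hypothetical helper and not part of sherpa-onnx.

```python
def pick_speaker_ids(num_speakers, count):
    """Return up to `count` distinct speaker IDs spread evenly over
    0..num_speakers-1, always including the first and last ID when count > 1."""
    if num_speakers <= 0 or count <= 0:
        return []
    count = min(count, num_speakers)
    if count == 1:
        return [0]
    # Integer arithmetic keeps the result deterministic and strictly increasing.
    return [(i * (num_speakers - 1)) // (count - 1) for i in range(count)]
```

Each returned ID can then be passed as `sid` when generating audio, as in the API examples for this model.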
vits-piper-en_US-ljspeech-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ljspeech/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-ljspeech-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-ljspeech-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
data_dir="vits-piper-en_US-ljspeech-high/espeak-ng-data",
tokens="vits-piper-en_US-ljspeech-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
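The example above raises a generic ValueError when `config.validate()` fails. Checking the three paths it references beforehand can give a more specific message. A minimal sketch, assuming the model was extracted into a directory laid out as in the paths above; `missing_model_files` is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist:
    the ONNX model, tokens.txt, and the espeak-ng-data directory."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]
```

Calling `missing_model_files("vits-piper-en_US-ljspeech-high", "en_US-ljspeech-high.onnx")` before building the config returns an empty list when all three paths are present.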
C++ API
You can use the following code to play with vits-piper-en_US-ljspeech-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx".into()),
tokens: Some("vits-piper-en_US-ljspeech-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-ljspeech-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-ljspeech-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx',
tokens: 'vits-piper-en_US-ljspeech-high/tokens.txt',
dataDir: 'vits-piper-en_US-ljspeech-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx',
tokens: 'vits-piper-en_US-ljspeech-high/tokens.txt',
dataDir: 'vits-piper-en_US-ljspeech-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-ljspeech-high/tokens.txt",
dataDir: "vits-piper-en_US-ljspeech-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-ljspeech-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ljspeech-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ljspeech-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
tokens = "vits-piper-en_US-ljspeech-high/tokens.txt",
dataDir = "vits-piper-en_US-ljspeech-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx");
vits.setTokens("vits-piper-en_US-ljspeech-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-ljspeech-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-ljspeech-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-ljspeech-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-ljspeech-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-ljspeech-high/en_US-ljspeech-high.onnx",
Tokens: "vits-piper-en_US-ljspeech-high/tokens.txt",
DataDir: "vits-piper-en_US-ljspeech-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-ljspeech-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ljspeech/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
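To check that a generated file matches the sample rate listed above, the header of the output file can be inspected. The following is a minimal sketch, assuming the file is a PCM WAV file (such as the test.wav written by the examples in this section) that Python's standard `wave` module can read. `wav_info` is a hypothetical helper name and is not part of sherpa-onnx.

```python
import wave


def wav_info(path: str) -> dict:
    """Read basic header fields from a PCM WAV file.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "sample_width_bytes": f.getsampwidth(),
            # Duration in seconds = number of frames / frames per second
            "duration_seconds": num_frames / sample_rate,
        }
```

For example, `wav_info("test.wav")["sample_rate"]` would be expected to equal the sample rate in the table above if the file was generated with this model.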
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-ljspeech-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
data_dir="vits-piper-en_US-ljspeech-medium/espeak-ng-data",
tokens="vits-piper-en_US-ljspeech-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
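The Node.js example in this document reports a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. `real_time_factor` and `timed` are hypothetical helper names introduced here for illustration; they are not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (smaller means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the `tts` object from the example above, one possible usage is `audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)` followed by `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`. An RTF below 1 means the audio is generated faster than it plays back.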
C++ API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx".into()),
tokens: Some("vits-piper-en_US-ljspeech-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-ljspeech-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-ljspeech-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx',
tokens: 'vits-piper-en_US-ljspeech-medium/tokens.txt',
dataDir: 'vits-piper-en_US-ljspeech-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx',
tokens: 'vits-piper-en_US-ljspeech-medium/tokens.txt',
dataDir: 'vits-piper-en_US-ljspeech-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-ljspeech-medium/tokens.txt",
dataDir: "vits-piper-en_US-ljspeech-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
tokens = "vits-piper-en_US-ljspeech-medium/tokens.txt",
dataDir = "vits-piper-en_US-ljspeech-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx");
vits.setTokens("vits-piper-en_US-ljspeech-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-ljspeech-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-ljspeech-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-ljspeech-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-ljspeech-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-ljspeech-medium/en_US-ljspeech-medium.onnx",
Tokens: "vits-piper-en_US-ljspeech-medium/tokens.txt",
DataDir: "vits-piper-en_US-ljspeech-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
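The same environment setup can be done from a script. The sketch below is a hypothetical helper (not part of sherpa-onnx) that picks the variable name by platform and prepends the library directory. It assumes a Linux or macOS host, where the path separator is `:`.

```python
import os
import sys


def library_path_variable(platform: str = sys.platform) -> str:
    """Name of the dynamic-loader search path variable for the platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(lib_dir, env=None, platform=sys.platform):
    """Return a copy of env with lib_dir prepended to the loader search path."""
    new_env = dict(os.environ if env is None else env)
    var = library_path_variable(platform)
    existing = new_env.get(var, "")
    # Prepend so that the freshly built libraries are found first.
    new_env[var] = lib_dir if not existing else lib_dir + ":" + existing
    return new_env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_dir("/tmp/sherpa-onnx/shared/lib"))` would launch the compiled example with the adjusted environment.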
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-miro-high using the Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-miro-high/en_US-miro-high.onnx",
data_dir="vits-piper-en_US-miro-high/espeak-ng-data",
tokens="vits-piper-en_US-miro-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
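The Python example above relies on the soundfile package to save the result. If that dependency is not wanted, a 16-bit PCM WAV file can be written with the standard library alone. The sketch below assumes the generated samples are mono floats in the range [-1, 1]; `write_wav_pcm16` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Returns the number of frames written.
    """
    frames = bytearray()
    for s in samples:
        # Clip to the valid range before scaling to int16.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2
```

With the objects from the example above, the call would be `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.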
C++ API
You can use the following code to play with vits-piper-en_US-miro-high using the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-miro-high using the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-miro-high/en_US-miro-high.onnx".into()),
tokens: Some("vits-piper-en_US-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-miro-high/en_US-miro-high.onnx',
tokens: 'vits-piper-en_US-miro-high/tokens.txt',
dataDir: 'vits-piper-en_US-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
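The Node.js example reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. Values below 1 mean synthesis ran faster than playback. The same arithmetic as a small Python sketch (the function name is hypothetical):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration, with duration in seconds."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Example: 44100 samples at 22050 Hz is 2 seconds of audio.
# If generation took 0.5 seconds, the RTF is 0.5 / 2 = 0.25.
example_rtf = real_time_factor(0.5, 44100, 22050)
```

In practice `elapsed_seconds` would come from timing the generate call, for instance with `time.perf_counter()` before and after it.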
Dart API
You can use the following code to play with vits-piper-en_US-miro-high using the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-miro-high/en_US-miro-high.onnx',
tokens: 'vits-piper-en_US-miro-high/tokens.txt',
dataDir: 'vits-piper-en_US-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-miro-high using the Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-miro-high/tokens.txt",
dataDir: "vits-piper-en_US-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-miro-high using the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-miro-high using the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
tokens = "vits-piper-en_US-miro-high/tokens.txt",
dataDir = "vits-piper-en_US-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-miro-high using the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-miro-high/en_US-miro-high.onnx");
vits.setTokens("vits-piper-en_US-miro-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-miro-high using the Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-miro-high/en_US-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-miro-high using the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-miro-high/en_US-miro-high.onnx",
Tokens: "vits-piper-en_US-miro-high/tokens.txt",
DataDir: "vits-piper-en_US-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
audio samples for each speaker are listed below:
Speaker 0
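To confirm that a generated test.wav matches the sample rate listed for the model (22050 Hz here), the file header can be inspected with Python's standard `wave` module. This is a sketch that assumes the file is uncompressed PCM WAV; `describe_wav` is a hypothetical helper.

```python
import wave


def describe_wav(path):
    """Return sample rate, channel count and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "duration_seconds": f.getnframes() / rate,
        }
```

Calling `describe_wav("test.wav")` after running any of the examples should then report a `sample_rate` equal to the value in the model's table, provided the assumption about the file format holds.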
vits-piper-en_US-norman-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/norman/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
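The examples for this model assume the archive was extracted into a directory named vits-piper-en_US-norman-medium that contains the ONNX model, tokens.txt, and espeak-ng-data. A minimal sketch of a pre-flight check for that layout (the helper name is hypothetical; the file names are taken from the examples in this section):

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,          # acoustic model
        root / "tokens.txt",       # token table
        root / "espeak-ng-data",   # data directory passed as data_dir
    ]
    return [str(p) for p in expected if not p.exists()]
```

An empty list from `missing_model_files("vits-piper-en_US-norman-medium", "en_US-norman-medium.onnx")` means the paths used in the examples are all present.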
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-norman-medium using the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-norman-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-norman-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-norman-medium using the Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
data_dir="vits-piper-en_US-norman-medium/espeak-ng-data",
tokens="vits-piper-en_US-norman-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-norman-medium using the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-norman-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-norman-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-norman-medium using the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-norman-medium/en_US-norman-medium.onnx".into()),
tokens: Some("vits-piper-en_US-norman-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-norman-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-norman-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx',
tokens: 'vits-piper-en_US-norman-medium/tokens.txt',
dataDir: 'vits-piper-en_US-norman-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-norman-medium using the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx',
tokens: 'vits-piper-en_US-norman-medium/tokens.txt',
dataDir: 'vits-piper-en_US-norman-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-norman-medium using the Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-norman-medium/tokens.txt",
dataDir: "vits-piper-en_US-norman-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-norman-medium using the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-norman-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-norman-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-norman-medium using the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
tokens = "vits-piper-en_US-norman-medium/tokens.txt",
dataDir = "vits-piper-en_US-norman-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-norman-medium using the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-norman-medium/en_US-norman-medium.onnx");
vits.setTokens("vits-piper-en_US-norman-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-norman-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-norman-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-norman-medium/en_US-norman-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-norman-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-norman-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-norman-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-norman-medium/en_US-norman-medium.onnx",
Tokens: "vits-piper-en_US-norman-medium/tokens.txt",
DataDir: "vits-piper-en_US-norman-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-reza_ibrahim-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/reza_ibrahim/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
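The table above lists a 22050 Hz sample rate, so one quick sanity check on a generated file is to read its header back. The sketch below does this with Python's standard `wave` module, assuming the file is plain PCM WAV. `wav_info` is a helper name chosen for this illustration, and the demo file is a one-second stand-in written by the script itself, not real model output.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return sample_rate, f.getnchannels(), num_frames / sample_rate


# Stand-in for a generated file: one second of 16-bit mono silence at 22050 Hz.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(22050)
    f.writeframes(b"\x00\x00" * 22050)

sample_rate, num_channels, duration = wav_info(demo_path)
print(sample_rate, num_channels, duration)
```

To check real output, pass the path of the file written by any of the examples in this section (for instance `test.wav`) to `wav_info` instead of `demo_path`.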
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
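The environment variable to set depends on the platform: LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on macOS. The small sketch below prints the matching export line for the current machine. The helper names `library_path_var` and `export_line` are made up for this illustration, and the library directory is the install prefix assumed in the build steps above.

```python
import platform


def library_path_var(system=None):
    """Name of the variable the dynamic loader consults on this platform."""
    system = system or platform.system()
    return "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"


def export_line(lib_dir, system=None):
    """Build the shell line that prepends lib_dir to the loader search path."""
    var = library_path_var(system)
    return f"export {var}={lib_dir}:${var}"


# Assumes the shared build was installed to /tmp/sherpa-onnx/shared as above.
print(export_line("/tmp/sherpa-onnx/shared/lib"))
```

Paste the printed line into the shell session in which you run the compiled example.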
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
data_dir="vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
tokens="vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx".into()),
tokens: Some("vits-piper-en_US-reza_ibrahim-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx',
tokens: 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt',
dataDir: 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
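The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. `real_time_factor` is a helper name chosen for this illustration, and the timing numbers are invented rather than measured.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s spent synthesizing 44100 samples at 22050 Hz,
# which is 2 s of audio.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is produced faster than it takes to play back.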
Dart API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx',
tokens: 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt',
dataDir: 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
dataDir: "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
tokens = "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
dataDir = "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx");
vits.setTokens("vits-piper-en_US-reza_ibrahim-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-reza_ibrahim-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-reza_ibrahim-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-reza_ibrahim-medium/en_US-reza_ibrahim-medium.onnx",
Tokens: "vits-piper-en_US-reza_ibrahim-medium/tokens.txt",
DataDir: "vits-piper-en_US-reza_ibrahim-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-ryan-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-ryan-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-ryan-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
data_dir="vits-piper-en_US-ryan-high/espeak-ng-data",
tokens="vits-piper-en_US-ryan-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-ryan-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-ryan-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-ryan-high/en_US-ryan-high.onnx".into()),
tokens: Some("vits-piper-en_US-ryan-high/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-ryan-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-ryan-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx',
tokens: 'vits-piper-en_US-ryan-high/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-ryan-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx',
tokens: 'vits-piper-en_US-ryan-high/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-ryan-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
lexicon: "",
tokens: "vits-piper-en_US-ryan-high/tokens.txt",
dataDir: "vits-piper-en_US-ryan-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-ryan-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-ryan-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
tokens = "vits-piper-en_US-ryan-high/tokens.txt",
dataDir = "vits-piper-en_US-ryan-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
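The callback in the example above returns 1 to let generation continue and 0 to stop it. As an illustration of that return-value contract only, the Python sketch below collects the chunks it receives and asks to stop once a sample budget is reached. The class name `StopAfter` and its parameters are hypothetical, and how a callback is registered depends on the language binding, which is not shown here.

```python
class StopAfter:
    """Collect audio chunks and request a stop after max_seconds of audio.

    Hypothetical helper: it only models the contract used in the examples
    above (return 1 to continue, 0 to stop).
    """

    def __init__(self, sample_rate: int, max_seconds: float):
        self.max_samples = int(sample_rate * max_seconds)
        self.samples = []

    def __call__(self, chunk) -> int:
        self.samples.extend(chunk)
        # Continue while under the budget, otherwise ask to stop.
        return 1 if len(self.samples) < self.max_samples else 0
```

A callback of this shape is useful when a caller wants to cut off very long inputs without waiting for the whole text to be synthesized.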
Java API
You can use the following code to play with vits-piper-en_US-ryan-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-ryan-high/en_US-ryan-high.onnx");
vits.setTokens("vits-piper-en_US-ryan-high/tokens.txt");
vits.setDataDir("vits-piper-en_US-ryan-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-ryan-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-ryan-high/en_US-ryan-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-ryan-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-ryan-high/en_US-ryan-high.onnx",
Tokens: "vits-piper-en_US-ryan-high/tokens.txt",
DataDir: "vits-piper-en_US-ryan-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-ryan-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-low.tar.bz2
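The archive at the address above can be fetched and unpacked with any tool. As one possible approach, the Python sketch below downloads a `.tar.bz2` file and extracts it using only the standard library. The helper name `download_and_extract` is hypothetical, and the sketch assumes the archive unpacks into a `vits-piper-en_US-ryan-low` directory, as the paths in the examples in this section suggest.

```python
import tarfile
import urllib.request
from pathlib import Path


def download_and_extract(url: str, dest_dir: str = ".") -> Path:
    """Download a .tar.bz2 archive and unpack it into dest_dir.

    Hypothetical helper, not part of sherpa-onnx.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]  # file name taken from the URL
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)  # only use this with archives you trust
    archive.unlink()  # remove the archive once it is extracted
    return dest
```

Calling `download_and_extract(url)` with the address above should leave the model files in the current directory, ready for the examples that follow.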
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-ryan-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to set the library search path before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-ryan-low.tar.bz2
You can use the following code to play with vits-piper-en_US-ryan-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
data_dir="vits-piper-en_US-ryan-low/espeak-ng-data",
tokens="vits-piper-en_US-ryan-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
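The example above relies on the third-party soundfile package to save the audio. If that package is not available, the samples can also be written with Python's standard `wave` module. The sketch below assumes that the samples are mono floating-point values in the range [-1, 1]. The helper name `write_wave_int16` is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_int16(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

Under the assumption above, it could be called as `write_wave_int16("test.wav", audio.samples, audio.sample_rate)`.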
C++ API
You can use the following code to play with vits-piper-en_US-ryan-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to set the library search path before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-ryan-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-ryan-low/en_US-ryan-low.onnx".into()),
tokens: Some("vits-piper-en_US-ryan-low/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-ryan-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-ryan-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx',
tokens: 'vits-piper-en_US-ryan-low/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-ryan-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx',
tokens: 'vits-piper-en_US-ryan-low/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-ryan-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
lexicon: "",
tokens: "vits-piper-en_US-ryan-low/tokens.txt",
dataDir: "vits-piper-en_US-ryan-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-ryan-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-ryan-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
tokens = "vits-piper-en_US-ryan-low/tokens.txt",
dataDir = "vits-piper-en_US-ryan-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-ryan-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-ryan-low/en_US-ryan-low.onnx");
vits.setTokens("vits-piper-en_US-ryan-low/tokens.txt");
vits.setDataDir("vits-piper-en_US-ryan-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-ryan-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-ryan-low/en_US-ryan-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-ryan-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-ryan-low/en_US-ryan-low.onnx",
Tokens: "vits-piper-en_US-ryan-low/tokens.txt",
DataDir: "vits-piper-en_US-ryan-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-ryan-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/ryan/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
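The sample rate in the table above is the rate of the audio this model generates, so a WAV file produced with this model should report 22050 Hz. One way to check a saved file is sketched below in Python. It assumes the file is uncompressed PCM, which the standard `wave` module can read, and the helper name `wav_info` is hypothetical.

```python
import wave


def wav_info(filename: str):
    """Return (sample_rate, duration_seconds) of a PCM WAV file."""
    with wave.open(filename, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnframes() / rate
```

Playing a file at a rate other than the one reported here changes both the pitch and the speed of the speech, so the check is a quick sanity test before integrating the audio elsewhere.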
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-ryan-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to set the library search path before you run /tmp/test-piper:
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-ryan-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
data_dir="vits-piper-en_US-ryan-medium/espeak-ng-data",
tokens="vits-piper-en_US-ryan-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-en_US-ryan-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
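The library search path can also be set from a small launcher script instead of the shell. The following is a minimal Python sketch, assuming the install prefix /tmp/sherpa-onnx/shared used above; the helper names loader_path_var and env_with_library_dir are hypothetical and are not part of sherpa-onnx.

```python
import os
import platform


def loader_path_var(system: str) -> str:
    """Return the dynamic-loader search-path variable for the given OS name."""
    # platform.system() returns "Darwin" on macOS and "Linux" on Linux.
    return "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"


def env_with_library_dir(env, lib_dir: str, system: str) -> dict:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    var = loader_path_var(system)
    new_env = dict(env)
    old = new_env.get(var, "")
    new_env[var] = lib_dir + (os.pathsep + old if old else "")
    return new_env


if __name__ == "__main__":
    system = platform.system()
    env = env_with_library_dir(os.environ, "/tmp/sherpa-onnx/shared/lib", system)
    print(loader_path_var(system), "=", env[loader_path_var(system)])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting /tmp/test-piper, so the variable is set only for that process.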
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-ryan-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx".into()),
tokens: Some("vits-piper-en_US-ryan-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-ryan-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-ryan-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx',
tokens: 'vits-piper-en_US-ryan-medium/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
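The real-time factor (RTF) printed by the example above is the elapsed processing time divided by the duration of the generated audio. As a minimal sketch of the same arithmetic in Python (the function name and the numbers are illustrative assumptions, not measurements):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # RTF = 0.250
```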
Dart API
You can use the following code to play with vits-piper-en_US-ryan-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx',
tokens: 'vits-piper-en_US-ryan-medium/tokens.txt',
dataDir: 'vits-piper-en_US-ryan-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-ryan-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-ryan-medium/tokens.txt",
dataDir: "vits-piper-en_US-ryan-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-ryan-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-ryan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-ryan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-ryan-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
tokens = "vits-piper-en_US-ryan-medium/tokens.txt",
dataDir = "vits-piper-en_US-ryan-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
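The callback's return value controls generation: 1 means continue and 0 means stop. The sketch below illustrates that contract in plain Python without calling sherpa-onnx. Here fake_generate is a hypothetical stand-in for the engine that hands audio chunks to the callback, and StopAfter is an illustrative callback; neither is part of any sherpa-onnx API.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine: feed chunks until callback returns 0."""
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop
            break
    return produced


class StopAfter:
    """Callback that asks the generator to stop once max_samples have been received."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples: List[float]) -> int:
        self.received += len(samples)
        return 1 if self.received < self.max_samples else 0


chunks = [[0.0] * 100, [0.1] * 100, [0.2] * 100]
audio = fake_generate(chunks, StopAfter(max_samples=150))
print(len(audio))  # 200: the second chunk crosses the limit, the third is never produced
```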
Java API
You can use the following code to play with vits-piper-en_US-ryan-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx");
vits.setTokens("vits-piper-en_US-ryan-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-ryan-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-ryan-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-ryan-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-ryan-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-ryan-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-ryan-medium/en_US-ryan-medium.onnx",
Tokens: "vits-piper-en_US-ryan-medium/tokens.txt",
DataDir: "vits-piper-en_US-ryan-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-en_US-sam-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/sam/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-en_US-sam-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-sam-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-sam-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-en_US-sam-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
data_dir="vits-piper-en_US-sam-medium/espeak-ng-data",
tokens="vits-piper-en_US-sam-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
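If soundfile is not available, the generated samples can also be written with Python's standard wave module. This is a minimal sketch, assuming mono audio and float samples in the range [-1, 1]; write_wav_16bit is a hypothetical helper and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(filename: str, samples, sample_rate: int) -> int:
    """Write mono float samples in [-1, 1] as 16-bit PCM; return the number of frames."""
    pcm = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the value fits in 16 bits
        pcm.append(int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return len(pcm)
```

With the objects from the example above, the call would be write_wav_16bit("test.wav", audio.samples, audio.sample_rate).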
C++ API
You can use the following code to play with vits-piper-en_US-sam-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
config.model.vits.tokens = "vits-piper-en_US-sam-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-en_US-sam-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-en_US-sam-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-en_US-sam-medium/en_US-sam-medium.onnx".into()),
tokens: Some("vits-piper-en_US-sam-medium/tokens.txt".into()),
data_dir: Some("vits-piper-en_US-sam-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-en_US-sam-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx',
tokens: 'vits-piper-en_US-sam-medium/tokens.txt',
dataDir: 'vits-piper-en_US-sam-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-en_US-sam-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx',
tokens: 'vits-piper-en_US-sam-medium/tokens.txt',
dataDir: 'vits-piper-en_US-sam-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-en_US-sam-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
lexicon: "",
tokens: "vits-piper-en_US-sam-medium/tokens.txt",
dataDir: "vits-piper-en_US-sam-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-en_US-sam-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-en_US-sam-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-en_US-sam-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-en_US-sam-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
tokens = "vits-piper-en_US-sam-medium/tokens.txt",
dataDir = "vits-piper-en_US-sam-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-en_US-sam-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-en_US-sam-medium/en_US-sam-medium.onnx");
vits.setTokens("vits-piper-en_US-sam-medium/tokens.txt");
vits.setDataDir("vits-piper-en_US-sam-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-en_US-sam-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-en_US-sam-medium/en_US-sam-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-en_US-sam-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-en_US-sam-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-en_US-sam-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-en_US-sam-medium/en_US-sam-medium.onnx",
Tokens: "vits-piper-en_US-sam-medium/tokens.txt",
DataDir: "vits-piper-en_US-sam-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below:
Speaker 0
Estonian
This section lists text to speech models for Estonian.
supertonic-3-et
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Estonian (et).
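Because the language is selected by a short code, it can help to check the code against the list above before generating audio. The following is a minimal Python sketch; SUPPORTED_LANGS is copied from the list above, and check_lang is a hypothetical helper, not a sherpa-onnx function. Whether sherpa-onnx itself validates the code is not stated here.

```python
# Language codes copied from the list above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it "
    "lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()


def check_lang(lang: str) -> str:
    """Return the normalized language code, or raise if it is not in the list."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    return code


print(len(SUPPORTED_LANGS))  # 31
print(check_lang("et"))      # et
```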
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx and soundfile via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "et"
audio = tts.generate("See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
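The seven model paths in the example above share one directory prefix. The sketch below builds them from a single directory name and reports which files are missing, which may help when the model was extracted somewhere else. `supertonic_paths` and `missing_files` are hypothetical helpers; the keys and file names are taken from the example above.

```python
from pathlib import Path

# Keyword names and file names as they appear in the example above.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_paths(model_dir: str) -> dict:
    """Map each config keyword to its path under model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return {key: (root / name).as_posix() for key, name in SUPERTONIC_FILES.items()}


def missing_files(paths: dict) -> list:
    """Return the keywords whose file does not exist on disk."""
    return [key for key, p in paths.items() if not Path(p).is_file()]


paths = supertonic_paths("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
print(paths["vocoder"])
print("missing:", missing_files(paths))
```

If nothing is missing, the dictionary can be unpacked into the config from the example, for instance `sherpa_onnx.OfflineTtsSupertonicModelConfig(**paths)`.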
C API
Click to expand
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"et\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
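The progress callback in the example above follows a simple contract: return 1 to continue generating and 0 to stop. The Python sketch below only illustrates that contract with a simulated sequence of progress values. The `StopAfter` class and its `(samples, progress)` signature are assumptions made for illustration and are not tied to a specific sherpa-onnx binding.

```python
from typing import Sequence


class StopAfter:
    """Callback that asks the generator to stop once `limit` is reached.

    Mirrors the contract above: return 1 to continue, 0 to stop.
    """

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.calls = 0

    def __call__(self, samples: Sequence[float], progress: float) -> int:
        self.calls += 1
        print(f"Progress: {progress * 100:.3f}%")
        return 1 if progress < self.limit else 0


# Simulate a generator that reports progress in four chunks.
callback = StopAfter(limit=0.5)
for progress in (0.25, 0.5, 0.75, 1.0):
    if callback([0.0], progress) == 0:
        break  # the generator would stop producing audio here

print("callback invoked", callback.calls, "times")
```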
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-supertonic.
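If you launch the compiled binary from a script rather than from an interactive shell, the same library search path can be set for the child process only. The following is a minimal sketch, assuming the install prefix used above. `library_env` is a hypothetical helper, and `system` is expected to be a `platform.system()` value.

```python
def library_env(lib_dir: str, system: str, base: dict) -> dict:
    """Return a copy of `base` with lib_dir prepended to the loader search path.

    `system` is "Linux" or "Darwin" (macOS), as returned by platform.system().
    """
    var = {"Linux": "LD_LIBRARY_PATH", "Darwin": "DYLD_LIBRARY_PATH"}[system]
    env = dict(base)
    old = env.get(var, "")
    # Both variables use ":" as the separator.
    env[var] = lib_dir + (":" + old if old else "")
    return env


env = library_env(
    "/tmp/sherpa-onnx/shared/lib", "Linux", {"LD_LIBRARY_PATH": "/usr/local/lib"}
)
print(env["LD_LIBRARY_PATH"])  # prints /tmp/sherpa-onnx/shared/lib:/usr/local/lib
```

A typical call would pass `os.environ` as `base` and hand the result to `subprocess.run(["/tmp/test-supertonic"], env=env, cwd="/tmp")`.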
Use static library (static link)
Please see the documentation at
C++ API
Click to expand
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "et"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
Click to expand
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "et"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
Click to expand
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'et'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
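The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched in Python below. The numbers in the usage lines are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical numbers: 3 seconds of 24 kHz audio generated in 0.6 seconds.
rtf = real_time_factor(0.6, 72000, 24000)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.200
```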
Dart API
Click to expand
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'et'},
);
final audio = tts.generateWithConfig(text: 'See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
Click to expand
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "et"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
Click to expand
You can use the following code to play with supertonic-3 with C# API.
using System;
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"et\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
Click to expand
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "et"),
)
val audio = tts.generateWithConfigAndCallback(
text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
Click to expand
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"et\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
Click to expand
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "et"}';
Audio := Tts.GenerateWithConfig('See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
Click to expand
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "et"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for different speakers is listed below:
Speaker 0
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 1
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 2
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 3
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 4
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 5
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 6
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 7
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 8
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Speaker 9
0
Tere maailm.
1
Kuidas sul täna läheb?
2
Taevas on sinine ja tuul on vaikne.
3
Masinõpe aitab arvutitel andmetest õppida.
4
Kõnesüntees muudab teksti selgeks heliks.
5
Õpilased lugesid raamatukogus lühikest lugu.
6
Rong hilines rööbaste hoolduse tõttu.
7
Väikesed mudelid töötavad kiiresti kohalikes seadmetes.
8
Häälassistent aitab igapäevaste ülesannetega.
9
Stabiilne lugemine on tähtis nii lühikeste kui pikkade lausete jaoks.
Finnish
This section lists text to speech models for Finnish.
vits-piper-fi_FI-harri-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fi/fi_FI/harri/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Click to expand
Model download address
Android APK
Click to expand
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
Click to expand
You can use the following code to play with vits-piper-fi_FI-harri-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
config.model.vits.tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Click to expand
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fi_FI-harri-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
data_dir="vits-piper-fi_FI-harri-low/espeak-ng-data",
tokens="vits-piper-fi_FI-harri-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
Click to expand
You can use the following code to play with vits-piper-fi_FI-harri-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
config.model.vits.tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -o /tmp/test-piper /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
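If you launch the compiled binary from a script instead of an interactive shell, the same library path setup can be expressed programmatically. The following is a minimal sketch in Python; the helper names `loader_path_variable` and `with_library_dir` are hypothetical, and the library directory is the install prefix used above.

```python
import os
import sys


def loader_path_variable(platform=sys.platform):
    """Name of the environment variable the dynamic loader consults."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def with_library_dir(env, lib_dir, platform=sys.platform):
    """Return a copy of env with lib_dir prepended to the loader path."""
    var = loader_path_variable(platform)
    new_env = dict(env)
    current = new_env.get(var, "")
    # Prepend so that the freshly built libraries take precedence.
    new_env[var] = lib_dir + (os.pathsep + current if current else "")
    return new_env


if __name__ == "__main__":
    env = with_library_dir(os.environ, "/tmp/sherpa-onnx/shared/lib")
    print(loader_path_variable(), "=", env[loader_path_variable()])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting /tmp/test-piper.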
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fi_FI-harri-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx".into()),
tokens: Some("vits-piper-fi_FI-harri-low/tokens.txt".into()),
data_dir: Some("vits-piper-fi_FI-harri-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fi_FI-harri-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx',
tokens: 'vits-piper-fi_FI-harri-low/tokens.txt',
dataDir: 'vits-piper-fi_FI-harri-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
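The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python; the function name `real_time_factor` and the numbers are hypothetical and chosen only for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical numbers: 44100 samples at 22050 Hz is 2 s of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")  # 0.5 / 2.0
```

An RTF below 1 means the audio was generated faster than it takes to play it back.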
Dart API
You can use the following code to play with vits-piper-fi_FI-harri-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx',
tokens: 'vits-piper-fi_FI-harri-low/tokens.txt',
dataDir: 'vits-piper-fi_FI-harri-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fi_FI-harri-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
lexicon: "",
tokens: "vits-piper-fi_FI-harri-low/tokens.txt",
dataDir: "vits-piper-fi_FI-harri-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fi_FI-harri-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fi_FI-harri-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fi_FI-harri-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fi_FI-harri-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
tokens = "vits-piper-fi_FI-harri-low/tokens.txt",
dataDir = "vits-piper-fi_FI-harri-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fi_FI-harri-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx");
vits.setTokens("vits-piper-fi_FI-harri-low/tokens.txt");
vits.setDataDir("vits-piper-fi_FI-harri-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fi_FI-harri-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fi_FI-harri-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fi_FI-harri-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fi_FI-harri-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fi_FI-harri-low/fi_FI-harri-low.onnx",
Tokens: "vits-piper-fi_FI-harri-low/tokens.txt",
DataDir: "vits-piper-fi_FI-harri-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fi_FI-harri-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fi/fi_FI/harri/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fi_FI-harri-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
config.model.vits.tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -o /tmp/test-piper /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
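After the program has run, one way to sanity-check ./test.wav is to read its header and compare the sample rate with the value listed for this model (22050). The sketch below uses only the Python standard library and assumes the file is plain PCM WAV, which the `wave` module requires. The helper names `wav_summary` and `write_silence` are hypothetical; `write_silence` exists only so the sketch can be tried without a model.

```python
import wave


def wav_summary(path):
    """Return (sample_rate, num_channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


def write_silence(path, sample_rate, seconds):
    """Write 16-bit mono silence; only used here to demonstrate wav_summary."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * int(sample_rate * seconds))
```

For example, `wav_summary("./test.wav")` returns the sample rate as its first element, which you can compare against the table in "Info about this model".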
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fi_FI-harri-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
data_dir="vits-piper-fi_FI-harri-medium/espeak-ng-data",
tokens="vits-piper-fi_FI-harri-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-fi_FI-harri-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
config.model.vits.tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -o /tmp/test-piper /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx".into()),
tokens: Some("vits-piper-fi_FI-harri-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fi_FI-harri-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fi_FI-harri-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx',
tokens: 'vits-piper-fi_FI-harri-medium/tokens.txt',
dataDir: 'vits-piper-fi_FI-harri-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx',
tokens: 'vits-piper-fi_FI-harri-medium/tokens.txt',
dataDir: 'vits-piper-fi_FI-harri-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
lexicon: "",
tokens: "vits-piper-fi_FI-harri-medium/tokens.txt",
dataDir: "vits-piper-fi_FI-harri-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fi_FI-harri-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fi_FI-harri-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fi_FI-harri-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
tokens = "vits-piper-fi_FI-harri-medium/tokens.txt",
dataDir = "vits-piper-fi_FI-harri-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
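The callbacks in the examples on this page share one convention: return 1 to continue generating and 0 to stop. The Kotlin callback receives only the samples, while the C and C++ callbacks also receive a progress value. The sketch below illustrates the convention in plain Python with a stand-in generation loop. It does not call sherpa-onnx, and the names `make_callback` and `simulate_generation` are hypothetical.

```python
def make_callback(max_progress):
    """Build a callback that asks generation to stop once progress reaches max_progress."""
    def callback(samples, progress):
        # 1 means continue, 0 means stop.
        return 1 if progress < max_progress else 0
    return callback


def simulate_generation(num_chunks, callback):
    """Stand-in for a TTS engine: emits chunks until the callback returns 0."""
    produced = 0
    for i in range(num_chunks):
        produced += 1
        chunk = [0.0] * 4  # dummy samples
        if callback(chunk, (i + 1) / num_chunks) == 0:
            break
    return produced


if __name__ == "__main__":
    # Stops after half of the ten chunks have been produced.
    print(simulate_generation(10, make_callback(0.5)))
```

Stopping early this way is useful when, for instance, the user cancels playback before the whole text has been synthesized.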
Java API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx");
vits.setTokens("vits-piper-fi_FI-harri-medium/tokens.txt");
vits.setDataDir("vits-piper-fi_FI-harri-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fi_FI-harri-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fi_FI-harri-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fi_FI-harri-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fi_FI-harri-medium/fi_FI-harri-medium.onnx",
Tokens: "vits-piper-fi_FI-harri-medium/tokens.txt",
DataDir: "vits-piper-fi_FI-harri-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-fi
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Finnish (fi).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
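Because this model accepts 31 language codes and 10 speaker IDs, it can help to validate both before generating audio. The sketch below checks them against the values listed above and builds the JSON string form of the language setting, as used for the `extra` field in the C example on this page. The function name `generation_extra` is hypothetical, and whether the library itself rejects invalid values is not covered here.

```python
import json

# Language codes and speaker count as listed in "Info about this model".
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu "
    "id it lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10


def generation_extra(lang, sid):
    """Validate lang and sid, then return the JSON string for the extra field."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return json.dumps({"lang": lang})


print(generation_extra("fi", 0))  # {"lang": "fi"}
```

In the Python API example on this page the language is set with `gen_config.extra["lang"] = "fi"` instead, so only the validation part applies there.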
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "fi"
audio = tts.generate("Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"fi\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
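The C example above passes `extra` as a JSON string with a single `lang` key. As a small sketch of how such a string could be produced programmatically rather than escaped by hand, the Python snippet below uses the standard `json` module. The helper name `make_extra` is hypothetical, and the assumption that only the `lang` key is needed comes from the examples on this page.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the language code into the JSON form used in the examples.

    Hypothetical helper; the "lang" key is the only one shown on this page.
    """
    return json.dumps({"lang": lang})


print(make_extra("fi"))  # prints {"lang": "fi"}
```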
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
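The environment variable that must point at the shared libraries differs between Linux and macOS. The sketch below shows one way a launcher script could choose the variable and prepend the library directory before starting the compiled binary. The function names are hypothetical, and the directory is the install prefix assumed in the build steps above.

```python
import sys


def library_path_var(platform: str = sys.platform) -> str:
    """Return the dynamic-loader search-path variable for the given platform."""
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unsupported platform: {platform}")


def prepend_library_dir(env, lib_dir, platform: str = sys.platform):
    """Return a copy of env with lib_dir placed first in the search path."""
    var = library_path_var(platform)
    new_env = dict(env)
    old = new_env.get(var)
    # Keep any existing entries after the new directory.
    new_env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return new_env
```

A possible use is `subprocess.run(["/tmp/test-supertonic"], env=prepend_library_dir(os.environ, "/tmp/sherpa-onnx/shared/lib"))`, which leaves the calling shell's environment untouched.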
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "fi"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "fi"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'fi'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
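The Node.js example above reports a real-time factor (RTF): elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched below in Python as a standalone function. The function name is hypothetical and the numbers in the usage line are made up for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Compute RTF = elapsed time / audio duration.

    Values below 1.0 mean synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 48000 samples at 24000 Hz is 2 s of audio.
print(real_time_factor(0.5, 48000, 24000))  # prints 0.25
```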
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'fi'},
);
final audio = tts.generateWithConfig(text: 'Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "fi"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"fi\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "fi"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"fi\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "fi"}';
Audio := Tts.GenerateWithConfig('Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "fi"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 1
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 2
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 3
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 4
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 5
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 6
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 7
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 8
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
Speaker 9
0
Hei maailma.
1
Miten voit tänään?
2
Taivas on sininen ja tuuli on lempeä.
3
Koneoppiminen auttaa tietokoneita oppimaan datasta.
4
Puhesynteesi muuttaa tekstin selkeäksi ääneksi.
5
Oppilaat lukivat lyhyen tarinan kirjastossa.
6
Juna myöhästyi raiteiden huollon vuoksi.
7
Pienet mallit toimivat nopeasti paikallisilla laitteilla.
8
Ääniavustaja auttaa päivittäisissä tehtävissä.
9
Vakaa lukeminen on tärkeää sekä lyhyille että pitkille lauseille.
French
This section lists text to speech models for French.
vits-piper-fr_FR-gilles-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/gilles/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-gilles-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-gilles-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
data_dir="vits-piper-fr_FR-gilles-low/espeak-ng-data",
tokens="vits-piper-fr_FR-gilles-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
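If the `soundfile` package is not available, the generated samples could also be saved with Python's standard `wave` module. The sketch below assumes the samples are floats in the range [-1, 1] (an assumption about `audio.samples`, not something this page states) and converts them to 16-bit mono PCM. The function name `write_wav_pcm16` is hypothetical.

```python
import struct
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write float samples (assumed to lie in [-1, 1]) as a 16-bit mono WAV file."""
    pcm = bytearray()
    for s in samples:
        # Clip to [-1, 1] so out-of-range values cannot overflow int16.
        s = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the example above, a call might look like `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.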
C++ API
You can use the following code to play with vits-piper-fr_FR-gilles-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
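The Linux and macOS settings differ only in the variable name. As a rough sketch (not part of sherpa-onnx), the same choice can be made in a launcher script. The helper name with_library_path is hypothetical, and the library directory is the install prefix used in the build steps above.

```python
import os
import sys


def with_library_path(lib_dir, platform=None, base_env=None):
    """Return (variable name, environment copy) with lib_dir prepended
    to the dynamic-loader search path for the given platform."""
    platform = sys.platform if platform is None else platform
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; this sketch treats everything else like Linux.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir if not old else lib_dir + ":" + old
    return var, env


var, env = with_library_path("/tmp/sherpa-onnx/shared/lib")
print(var, "=", env[var])
```

The returned mapping can be passed as the env argument of subprocess.run when starting /tmp/test-piper from Python.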
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx".into()),
tokens: Some("vits-piper-fr_FR-gilles-low/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-gilles-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-gilles-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx',
tokens: 'vits-piper-fr_FR-gilles-low/tokens.txt',
dataDir: 'vits-piper-fr_FR-gilles-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
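The real-time factor (RTF) printed by the example above is the elapsed processing time divided by the duration of the generated audio. The following library-free sketch shows the same arithmetic in Python. The function name real_time_factor and the numbers are illustrative assumptions, not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a positive sample count and sample rate")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Made-up numbers: 2 s of audio at 22050 Hz generated in 0.5 s.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```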
Dart API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx',
tokens: 'vits-piper-fr_FR-gilles-low/tokens.txt',
dataDir: 'vits-piper-fr_FR-gilles-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-gilles-low/tokens.txt",
dataDir: "vits-piper-fr_FR-gilles-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-gilles-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-gilles-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-gilles-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
tokens = "vits-piper-fr_FR-gilles-low/tokens.txt",
dataDir = "vits-piper-fr_FR-gilles-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
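The callback convention used above (return 1 to continue, 0 to stop) can be illustrated without the library. The sketch below uses a hypothetical stand-in, generate_with_callback, that feeds chunks of samples to a callback. It is not the sherpa-onnx engine, and whether the real engine keeps the chunk on which the callback returned 0 is an assumption made here.

```python
def generate_with_callback(chunks, callback):
    """Hypothetical stand-in for a TTS engine: pass each chunk to the
    callback and stop early as soon as the callback returns 0."""
    collected = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:
            break
    return collected


def make_limit_callback(max_samples):
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1 = continue, 0 = stop

    return callback


chunks = [[0.0] * 100, [0.1] * 100, [0.2] * 100]
audio = generate_with_callback(chunks, make_limit_callback(150))
print(len(audio))  # 200: the third chunk is never requested
```

A callback of this shape is a convenient place to stream samples to a player or to cancel a long synthesis.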
Java API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx");
vits.setTokens("vits-piper-fr_FR-gilles-low/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-gilles-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-gilles-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-gilles-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-gilles-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-gilles-low/fr_FR-gilles-low.onnx",
Tokens: "vits-piper-fr_FR-gilles-low/tokens.txt",
DataDir: "vits-piper-fr_FR-gilles-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-miro-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
            data_dir="vits-piper-fr_FR-miro-high/espeak-ng-data",
            tokens="vits-piper-fr_FR-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
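If soundfile is not available, the standard-library wave module can store the samples as 16-bit PCM. This is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]. The helper name write_wav_16bit is hypothetical, and the demo uses a synthetic tone rather than real TTS output.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples as 16-bit PCM; return the duration in seconds."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the assumed [-1, 1] range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(samples) / sample_rate


# Demo: one second of a 440 Hz tone at this model's 22050 Hz sample rate.
sample_rate = 22050
tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")
    seconds = write_wav_16bit(path, tone, sample_rate)
    with wave.open(path, "rb") as f:
        print(f.getframerate(), f.getnframes(), seconds)
```

With real output, the call would take audio.samples and audio.sample_rate from the example above.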
C++ API
You can use the following code to play with vits-piper-fr_FR-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx".into()),
tokens: Some("vits-piper-fr_FR-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx',
tokens: 'vits-piper-fr_FR-miro-high/tokens.txt',
dataDir: 'vits-piper-fr_FR-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fr_FR-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx',
tokens: 'vits-piper-fr_FR-miro-high/tokens.txt',
dataDir: 'vits-piper-fr_FR-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-miro-high/tokens.txt",
dataDir: "vits-piper-fr_FR-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
tokens = "vits-piper-fr_FR-miro-high/tokens.txt",
dataDir = "vits-piper-fr_FR-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fr_FR-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx");
vits.setTokens("vits-piper-fr_FR-miro-high/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-miro-high/fr_FR-miro-high.onnx",
Tokens: "vits-piper-fr_FR-miro-high/tokens.txt",
DataDir: "vits-piper-fr_FR-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-siwis-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/siwis/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-siwis-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-siwis-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
            data_dir="vits-piper-fr_FR-siwis-low/espeak-ng-data",
            tokens="vits-piper-fr_FR-siwis-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
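Every example for this model points at the same three paths: the .onnx file, tokens.txt, and the espeak-ng-data directory. Before constructing the config, it can help to confirm that they exist relative to the working directory. The following is a small sketch, independent of whatever config.validate() checks. The helper name missing_model_files is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the required model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files("vits-piper-fr_FR-siwis-low", "fr_FR-siwis-low.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```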
C++ API
You can use the following code to play with vits-piper-fr_FR-siwis-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx".into()),
tokens: Some("vits-piper-fr_FR-siwis-low/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-siwis-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-siwis-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx',
tokens: 'vits-piper-fr_FR-siwis-low/tokens.txt',
dataDir: 'vits-piper-fr_FR-siwis-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
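The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same arithmetic as a Python sketch, for readers who want to check the numbers by hand; the function name and the example values are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    A value below 1 means the audio was generated faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 3 s of audio at 16000 Hz produced in 0.6 s.
    rtf = real_time_factor(0.6, 48000, 16000)
    print(f"RTF = {rtf:.3f}")
```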
Dart API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx',
tokens: 'vits-piper-fr_FR-siwis-low/tokens.txt',
dataDir: 'vits-piper-fr_FR-siwis-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-siwis-low/tokens.txt",
dataDir: "vits-piper-fr_FR-siwis-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-siwis-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-siwis-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-siwis-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
tokens = "vits-piper-fr_FR-siwis-low/tokens.txt",
dataDir = "vits-piper-fr_FR-siwis-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
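The callback in the Kotlin example returns 1 to continue and 0 to stop. The following Python sketch simulates that convention with placeholder data. `run_with_callback` and `stop_after` are hypothetical names, and the loop is a stand-in written for illustration; it is not the sherpa-onnx implementation.

```python
def run_with_callback(chunks, callback):
    """Pass chunks to callback one by one; stop as soon as it returns 0."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return produced


def stop_after(max_samples):
    """Build a callback that returns 0 once max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


if __name__ == "__main__":
    fake_chunks = [[0.0] * 4, [0.1] * 4, [0.2] * 4]  # placeholder audio
    audio = run_with_callback(fake_chunks, stop_after(8))
    print(len(audio), "samples kept before the callback asked to stop")
```

A callback of this shape is one way to cancel a long synthesis early, for example when the user presses a stop button.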
Java API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx");
vits.setTokens("vits-piper-fr_FR-siwis-low/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-siwis-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-siwis-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-siwis-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-siwis-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-siwis-low/fr_FR-siwis-low.onnx",
Tokens: "vits-piper-fr_FR-siwis-low/tokens.txt",
DataDir: "vits-piper-fr_FR-siwis-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-siwis-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/siwis/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
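To confirm that a generated file matches the sample rate listed in the table, the WAV header can be read with the Python standard library. This is a minimal sketch that assumes the file is plain PCM WAV (which the standard `wave` module can read) and is named test.wav as in the examples of this section; `wav_info` is a hypothetical helper.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


if __name__ == "__main__":
    path = "test.wav"  # output name used by the examples in this section
    if os.path.exists(path):
        rate, channels, seconds = wav_info(path)
        print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
        if rate != 22050:
            print("Sample rate differs from the value listed for this model")
    else:
        print(f"{path} not found; run one of the examples first")
```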
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-siwis-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
            data_dir="vits-piper-fr_FR-siwis-medium/espeak-ng-data",
            tokens="vits-piper-fr_FR-siwis-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
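If soundfile is not available, the samples can also be written with the Python standard library. The sketch below assumes the generated samples are mono floats in the range [-1, 1], as the soundfile call above suggests; that range is an assumption, so values are clipped before conversion. `write_wave_int16` is a hypothetical helper, and the demo data is a placeholder rather than real model output.

```python
import struct
import wave


def write_wave_int16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Placeholder data: 0.1 s of silence at the 22050 Hz listed for this model.
    write_wave_int16("silence.wav", [0.0] * 2205, 22050)
```

With real output, the call would take the sample list and sample rate returned by the generation step in place of the placeholder values.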
C++ API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx".into()),
tokens: Some("vits-piper-fr_FR-siwis-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-siwis-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-siwis-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx',
tokens: 'vits-piper-fr_FR-siwis-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-siwis-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx',
tokens: 'vits-piper-fr_FR-siwis-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-siwis-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-siwis-medium/tokens.txt",
dataDir: "vits-piper-fr_FR-siwis-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
tokens = "vits-piper-fr_FR-siwis-medium/tokens.txt",
dataDir = "vits-piper-fr_FR-siwis-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx");
vits.setTokens("vits-piper-fr_FR-siwis-medium/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-siwis-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-siwis-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-siwis-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-siwis-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-siwis-medium/fr_FR-siwis-medium.onnx",
Tokens: "vits-piper-fr_FR-siwis-medium/tokens.txt",
DataDir: "vits-piper-fr_FR-siwis-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-tjiho-model1
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model1/tree/main
| Number of speakers | Sample rate |
|---|---|
| 1 | 44100 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
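If the compiled example is launched from a script rather than an interactive shell, the same library search path can be set for the child process only. The following is a minimal sketch; the helper `env_with_library_path` is hypothetical, and falling back to `LD_LIBRARY_PATH` on every system other than macOS is an assumption made here for brevity.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given system."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    # Assumption: any other system is treated like Linux.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(f"{name}={env[name]}")
```

For example, `subprocess.run(["/tmp/test-piper"], env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"), check=True)` would launch the compiled example with the adjusted environment without changing the parent shell.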
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-tjiho-model1
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
            data_dir="vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
            tokens="vits-piper-fr_FR-tjiho-model1/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
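When the config check fails, it helps to know which path is wrong. The following is a minimal sketch, using only the standard library, that lists the entries the example above expects inside the model directory. The helper `missing_model_files` is hypothetical and not part of sherpa-onnx; the three names it checks are taken from the paths used in the example.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that do not exist on disk."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-fr_FR-tjiho-model1", "fr_FR-tjiho-model1.onnx"
    )
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("All model files found")
```

Run it from the directory that contains the extracted model folder; an empty result means the relative paths in the example should resolve.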
C++ API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx".into()),
tokens: Some("vits-piper-fr_FR-tjiho-model1/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-tjiho-model1/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model1/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
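The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio, so a value below 1 means the audio is produced faster than it plays back. The following is a minimal sketch of that arithmetic; the function name and the numbers in the demo are hypothetical.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower is faster)."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 88200 samples at 44100 Hz is 2 s of audio,
    # generated in 0.5 s of wall-clock time.
    rtf = real_time_factor(0.5, 88200, 44100)
    print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```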
Dart API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model1/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-tjiho-model1/tokens.txt",
dataDir: "vits-piper-fr_FR-tjiho-model1/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
tokens = "vits-piper-fr_FR-tjiho-model1/tokens.txt",
dataDir = "vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
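The callback in the example above follows a simple contract: it receives a chunk of samples and returns 1 to continue or 0 to stop. The following is a minimal sketch of that contract in isolation. `stream_chunks` is a hypothetical stand-in for a generator of audio chunks and is not a sherpa-onnx function; whether the engine keeps the chunk that triggered the stop is an assumption made here.

```python
def stream_chunks(samples, chunk_size, callback):
    """Hypothetical stand-in for a TTS engine: hand out samples chunk by
    chunk and stop as soon as the callback returns 0."""
    delivered = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        delivered.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return delivered


def make_limit_callback(max_samples):
    """Build a callback that asks to stop once max_samples were received."""
    seen = 0

    def callback(chunk):
        nonlocal seen
        seen += len(chunk)
        return 1 if seen < max_samples else 0

    return callback


if __name__ == "__main__":
    samples = [0.0] * 10
    print(len(stream_chunks(samples, 3, lambda chunk: 1)))          # prints 10
    print(len(stream_chunks(samples, 3, make_limit_callback(5))))   # prints 6
```

A stop-capable callback like this is useful when playback is cancelled by the user and the remaining audio no longer needs to be generated.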
Java API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx");
vits.setTokens("vits-piper-fr_FR-tjiho-model1/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-tjiho-model1/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model1/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model1/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-tjiho-model1 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-tjiho-model1/fr_FR-tjiho-model1.onnx",
Tokens: "vits-piper-fr_FR-tjiho-model1/tokens.txt",
DataDir: "vits-piper-fr_FR-tjiho-model1/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
audio samples for each speaker are listed below:
Speaker 0
vits-piper-fr_FR-tjiho-model2
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model2/tree/main
| Number of speakers | Sample rate |
|---|---|
| 1 | 44100 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
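const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
// Creation returns NULL if the config is invalid, e.g., a wrong model path
if (!tts) {
fprintf(stderr, "Failed to create the TTS engine. Please check your config.\n");
return -1;
}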
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-tjiho-model2
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
            data_dir="vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
            tokens="vits-piper-fr_FR-tjiho-model2/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
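If a WAV file is preferred and an extra audio dependency is not wanted, the samples can also be written with Python's standard library. The following is a minimal sketch, assuming the samples are floats in the range [-1, 1] and the audio is mono; the helper `write_wav_16bit` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(path, samples, sample_rate):
    """Write float samples (assumed to be in [-1, 1]) as mono 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


if __name__ == "__main__":
    # A short ramp instead of real TTS output, written at 44100 Hz
    ramp = [i / 100.0 for i in range(-100, 101)]
    write_wav_16bit("ramp.wav", ramp, 44100)
    print("Saved to ramp.wav")
```

With the example above, the call would look like `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.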
C++ API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx".into()),
tokens: Some("vits-piper-fr_FR-tjiho-model2/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-tjiho-model2/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model2/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model2/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-tjiho-model2/tokens.txt",
dataDir: "vits-piper-fr_FR-tjiho-model2/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
tokens = "vits-piper-fr_FR-tjiho-model2/tokens.txt",
dataDir = "vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx");
vits.setTokens("vits-piper-fr_FR-tjiho-model2/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-tjiho-model2/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model2/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model2/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-tjiho-model2 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-tjiho-model2/fr_FR-tjiho-model2.onnx",
Tokens: "vits-piper-fr_FR-tjiho-model2/tokens.txt",
DataDir: "vits-piper-fr_FR-tjiho-model2/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
audio samples for each speaker are listed below:
Speaker 0
vits-piper-fr_FR-tjiho-model3
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model3/tree/main
| Number of speakers | Sample rate |
|---|---|
| 1 | 44100 |
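Once a file has been generated with any of the examples in this section, one quick sanity check is to confirm that its sample rate matches the table above. The following is a minimal sketch that uses only the Python standard library. `wav_info` and `write_silence` are hypothetical helpers, not part of sherpa-onnx, and the script writes a short silent file so that it runs without the model. It assumes the file is PCM-encoded, which is what the `wave` module can read.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate, num_frames, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
    return sample_rate, num_frames, num_frames / sample_rate


def write_silence(path, sample_rate, seconds):
    """Write a mono 16-bit silent WAV file; a stand-in for a real test.wav."""
    num_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample -> 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * num_frames)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "silence.wav")
        write_silence(path, sample_rate=44100, seconds=0.5)
        sample_rate, num_frames, duration = wav_info(path)
        print(sample_rate, num_frames, duration)
```

To check real output, call `wav_info("test.wav")` after running one of the examples and compare the first value with the sample rate listed for the model.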
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
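If the example is launched from a script rather than an interactive shell, the library search path from the note above can be set for that one process only. The following is a minimal sketch under that assumption. `env_with_library_path` is a hypothetical helper, not part of sherpa-onnx, and the paths are the ones used in this section.

```python
import os
import sys


def env_with_library_path(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path (DYLD_LIBRARY_PATH on macOS, else LD_LIBRARY_PATH)."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Keep any existing entries after the new directory.
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
    # To launch the compiled example with this environment, one could use:
    #   subprocess.run(["/tmp/test-piper"], env=env, cwd="/tmp", check=True)
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(name, "=", env[name])
```

Passing the environment to a single child process leaves the parent shell untouched, which avoids affecting other programs that load a different onnxruntime build.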
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
data_dir="vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
tokens="vits-piper-fr_FR-tjiho-model3/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx".into()),
tokens: Some("vits-piper-fr_FR-tjiho-model3/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-tjiho-model3/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model3/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
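The Node.js example above reports a real-time factor (RTF), computed as the elapsed processing time divided by the duration of the generated audio. A value below 1 means the audio was produced faster than it takes to play. The same arithmetic can be sketched in Python as follows. `real_time_factor` and `timed` are hypothetical helpers, and the lambda is only a stand-in for a real TTS call.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in for a TTS call: pretend it produced 2 seconds of audio at 44100 Hz.
    samples, elapsed = timed(lambda: [0.0] * (2 * 44100))
    rtf = real_time_factor(elapsed, len(samples), 44100)
    print(f"Elapsed {elapsed:.3f} s, RTF = {rtf:.3f}")
```

For instance, 0.5 seconds of processing for 88200 samples at 44100 Hz (2 seconds of audio) gives an RTF of 0.25.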
Dart API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx',
tokens: 'vits-piper-fr_FR-tjiho-model3/tokens.txt',
dataDir: 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-tjiho-model3/tokens.txt",
dataDir: "vits-piper-fr_FR-tjiho-model3/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
tokens = "vits-piper-fr_FR-tjiho-model3/tokens.txt",
dataDir = "vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx");
vits.setTokens("vits-piper-fr_FR-tjiho-model3/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-tjiho-model3/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tjiho-model3/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tjiho-model3/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-tjiho-model3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-tjiho-model3/fr_FR-tjiho-model3.onnx",
Tokens: "vits-piper-fr_FR-tjiho-model3/tokens.txt",
DataDir: "vits-piper-fr_FR-tjiho-model3/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-tom-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/tom/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 44100 |
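The examples in this section obtain the audio as an array of float samples plus a sample rate and then save it as a wave file. To illustrate what such a save step involves, here is a minimal standard-library sketch that writes mono float samples as 16-bit PCM. It assumes the samples are nominally in the range [-1, 1]. `float_to_pcm16` and `write_wave_pcm16` are hypothetical helpers and are not how sherpa-onnx itself implements its wave writer.

```python
import math
import os
import struct
import tempfile
import wave


def float_to_pcm16(samples):
    """Convert floats in [-1, 1] to little-endian 16-bit PCM bytes, clipping outliers."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def write_wave_pcm16(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 16-bit
        f.setframerate(sample_rate)
        f.writeframes(float_to_pcm16(samples))


if __name__ == "__main__":
    sample_rate = 44100
    # 0.1 seconds of a 440 Hz tone as placeholder audio.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
            for n in range(sample_rate // 10)]
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "tone.wav")
        write_wave_pcm16(path, tone, sample_rate)
        print("Wrote", os.path.getsize(path), "bytes")
```

Clipping before scaling keeps an occasional out-of-range sample from wrapping around when it is converted to a 16-bit integer.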
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-tom-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-tom-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
data_dir="vits-piper-fr_FR-tom-medium/espeak-ng-data",
tokens="vits-piper-fr_FR-tom-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-fr_FR-tom-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx".into()),
tokens: Some("vits-piper-fr_FR-tom-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-tom-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-tom-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx',
tokens: 'vits-piper-fr_FR-tom-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-tom-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx',
tokens: 'vits-piper-fr_FR-tom-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-tom-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-tom-medium/tokens.txt",
dataDir: "vits-piper-fr_FR-tom-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-tom-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-tom-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-tom-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
tokens = "vits-piper-fr_FR-tom-medium/tokens.txt",
dataDir = "vits-piper-fr_FR-tom-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
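The callback in the Kotlin example above follows a simple convention: it receives the samples generated so far in the current step and returns 1 to continue or 0 to stop. The sketch below imitates that convention in plain Python to show how a caller might cut generation short. `generate_in_chunks` is a hypothetical stand-in for a TTS engine, not the sherpa-onnx implementation, and whether the real engine keeps the chunk that triggered the stop is an assumption made here for illustration.

```python
def generate_in_chunks(chunks, callback=None):
    """Hypothetical stand-in for a TTS engine: collects audio chunk by chunk
    and stops early when the callback returns 0."""
    collected = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback is not None and callback(chunk) == 0:
            break  # the caller asked to stop
    return collected


def make_limit_callback(max_samples):
    """Build a callback that returns 1 until max_samples have been seen, then 0."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0

    return callback


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]
    audio = generate_in_chunks(chunks, make_limit_callback(250))
    print(len(audio))
```

A callback of this shape is also a natural place to hand each chunk to an audio player, so that playback can begin before the whole sentence has been synthesized.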
Java API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx");
vits.setTokens("vits-piper-fr_FR-tom-medium/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-tom-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-tom-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-tom-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-tom-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-tom-medium/fr_FR-tom-medium.onnx",
Tokens: "vits-piper-fr_FR-tom-medium/tokens.txt",
DataDir: "vits-piper-fr_FR-tom-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fr_FR-upmc-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fr/fr_FR/upmc/medium
| Number of speakers | Sample rate |
|---|---|
| 2 | 22050 |
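Since this model has two speakers, it can be handy to synthesize the same text once per speaker ID. The following is a minimal sketch, assuming a callable that accepts `text`, `sid`, and `speed` keyword arguments (for example the `generate` method of an `OfflineTts` object from the Python API). The helper name `generate_per_speaker` and the output file naming are hypothetical.

```python
def generate_per_speaker(generate, text, num_speakers, speed=1.0):
    """Call `generate` once per speaker ID and key each result by an output filename.

    `generate` is any callable taking text=..., sid=..., speed=... keyword arguments.
    """
    results = {}
    for sid in range(num_speakers):
        # Speaker IDs run from 0 to num_speakers - 1
        results[f"speaker-{sid}.wav"] = generate(text=text, sid=sid, speed=speed)
    return results
```

With a real `tts` object, a call such as `generate_per_speaker(tts.generate, "Pas de nouvelles, bonnes nouvelles.", 2)` would return one generated audio object per speaker, which can then be written to the corresponding file.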
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Pas de nouvelles, bonnes nouvelles.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
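If you launch the compiled binary from a script, the library search path can be set for that one process instead of exporting it in the shell. The following is a minimal sketch; the helper name `env_with_library_path` is hypothetical, and it assumes a POSIX-style `:` separator.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader path.

    macOS uses DYLD_LIBRARY_PATH; other platforms are assumed to use LD_LIBRARY_PATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    # Prepend so that the freshly built libraries take precedence
    env[var] = lib_dir + (":" + current if current else "")
    return env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"))` would run the example with the shared libraries visible to the dynamic loader.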
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fr_FR-upmc-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
data_dir="vits-piper-fr_FR-upmc-medium/espeak-ng-data",
tokens="vits-piper-fr_FR-upmc-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Pas de nouvelles, bonnes nouvelles.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
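If the soundfile package is not available, the generated samples can also be saved with Python's standard `wave` module. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the helper name `write_wav_int16` is hypothetical.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the example above, this could be called as `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.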
C++ API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
config.model.vits.tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Pas de nouvelles, bonnes nouvelles.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx".into()),
tokens: Some("vits-piper-fr_FR-upmc-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fr_FR-upmc-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Pas de nouvelles, bonnes nouvelles.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fr_FR-upmc-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx',
tokens: 'vits-piper-fr_FR-upmc-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-upmc-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Pas de nouvelles, bonnes nouvelles.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
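The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A value below 1 means synthesis ran faster than playback. The same arithmetic as a minimal Python sketch (the function name is hypothetical):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed synthesis time / duration of the generated audio."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, 44100 samples at 22050 Hz is 2 seconds of audio; if generating it took 0.5 seconds, the RTF is 0.25.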
Dart API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx',
tokens: 'vits-piper-fr_FR-upmc-medium/tokens.txt',
dataDir: 'vits-piper-fr_FR-upmc-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Pas de nouvelles, bonnes nouvelles.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
lexicon: "",
tokens: "vits-piper-fr_FR-upmc-medium/tokens.txt",
dataDir: "vits-piper-fr_FR-upmc-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Pas de nouvelles, bonnes nouvelles."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
tokens = "vits-piper-fr_FR-upmc-medium/tokens.txt",
dataDir = "vits-piper-fr_FR-upmc-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Pas de nouvelles, bonnes nouvelles.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx");
vits.setTokens("vits-piper-fr_FR-upmc-medium/tokens.txt");
vits.setDataDir("vits-piper-fr_FR-upmc-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Pas de nouvelles, bonnes nouvelles.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fr_FR-upmc-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fr_FR-upmc-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Pas de nouvelles, bonnes nouvelles.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fr_FR-upmc-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fr_FR-upmc-medium/fr_FR-upmc-medium.onnx",
Tokens: "vits-piper-fr_FR-upmc-medium/tokens.txt",
DataDir: "vits-piper-fr_FR-upmc-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Pas de nouvelles, bonnes nouvelles."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Pas de nouvelles, bonnes nouvelles.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
supertonic-3-fr
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for French (fr).
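Because the language is selected per request, it can help to validate the language code before building the generation config. The following is a minimal sketch using the 31 codes listed above. The names `SUPERTONIC_3_LANGS` and `make_extra` are hypothetical, and the returned JSON string follows the `{"lang": "fr"}` form that the C, C++, and Rust examples on this page pass as `extra`.

```python
import json

# Language codes listed for supertonic 3 on this page
SUPERTONIC_3_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def make_extra(lang):
    """Return the JSON string for the `extra` field, rejecting unlisted codes."""
    if lang not in SUPERTONIC_3_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})
```

In bindings where `extra` is a map rather than a string, only the validation step would be needed.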
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "fr"
audio = tts.generate("Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"fr\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "fr"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "fr"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'fr'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'fr'},
);
final audio = tts.generateWithConfig(text: "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération", config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "fr"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"fr\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "fr"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
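The callback in the Kotlin example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that convention without calling sherpa-onnx at all. `SampleBudgetCallback` and `fake_generate` are hypothetical names introduced only for this illustration; `fake_generate` is a stand-in for an engine that streams chunks of samples to a callback.

```python
from typing import Iterable, List, Sequence


class SampleBudgetCallback:
    """Collect streamed samples and request a stop once a budget is reached."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.collected: List[float] = []

    def __call__(self, samples: Sequence[float]) -> int:
        self.collected.extend(samples)
        # 1 means to continue, 0 means to stop
        return 1 if len(self.collected) < self.max_samples else 0


def fake_generate(chunks: Iterable[Sequence[float]], callback) -> int:
    """Hypothetical stand-in for a TTS engine; returns the number of chunks delivered."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break  # the callback asked us to stop
    return delivered


cb = SampleBudgetCallback(max_samples=5)
num_chunks = fake_generate([[0.0] * 3] * 4, cb)
print(num_chunks, len(cb.collected))
```

With a budget of 5 samples and chunks of 3 samples each, the callback requests a stop after the second chunk, so the remaining chunks are never delivered.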
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"fr\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "fr"}';
Audio := Tts.GenerateWithConfig('Il s''agit d''un moteur de synthèse vocale utilisant Kaldi de nouvelle génération', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "fr"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
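Several of the examples above (C#, Java, Pascal, and Go) pass the extra options as a JSON string such as `{"lang": "fr"}`. Writing that string by hand requires escaping the quotes correctly. The following Python sketch builds the string with the standard `json` module instead. The helper name `build_extra` is an assumption for illustration and is not part of sherpa-onnx.

```python
import json


def build_extra(**options: str) -> str:
    """Serialize keyword options to a JSON object string."""
    # ensure_ascii=False keeps non-ASCII values readable in the output
    return json.dumps(options, ensure_ascii=False)


extra = build_extra(lang="fr")
print(extra)  # prints {"lang": "fr"}
```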
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 1
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 2
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 3
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 4
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 5
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 6
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 7
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 8
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Speaker 9
0
Bonjour le monde.
1
Comment allez-vous aujourd’hui?
2
Le ciel est bleu.
3
J’aime l’apprentissage automatique.
4
Python est incroyable.
5
Bonjour à tous.
6
L’intelligence artificielle grandit.
7
La synthèse vocale est fascinante.
8
Les réseaux neuronaux sont puissants.
9
Le texte en voix convertit le texte en audio.
10
Le rapide renard brun saute par-dessus le chien paresseux.
11
L’apprentissage automatique permet aux ordinateurs d’apprendre.
12
Le traitement du langage naturel aide les machines à comprendre.
13
L’apprentissage profond a révolutionné l’intelligence artificielle.
14
La technologie de synthèse vocale a considérablement progressé.
15
Le clonage vocal neuronal peut reproduire les styles de parole.
16
La normalisation du texte est importante pour la prononciation.
17
Les assistants vocaux nous aident à interagir avec la technologie.
18
Les systèmes TTS modernes utilisent l’apprentissage profond.
19
L’interaction homme machine est devenue plus intuitive.
Georgian
This section lists text to speech models for Georgian.
vits-piper-ka_GE-natia-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ka/ka_GE/natia/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ka_GE-natia-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
config.model.vits.tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "ღვინო თბილისში, საქართველო სამტრედში";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ka_GE-natia-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
data_dir="vits-piper-ka_GE-natia-medium/espeak-ng-data",
tokens="vits-piper-ka_GE-natia-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="ღვინო თბილისში, საქართველო სამტრედში",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
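The example above relies on the third-party soundfile package to save the audio. If you prefer to avoid that dependency, the following sketch writes a 16-bit mono WAVE file using only the Python standard library. It assumes the samples are floats in the range [-1, 1]; the function name `write_wave_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> int:
    """Write float samples as 16-bit mono PCM; return the number of samples written."""
    frames = bytearray()
    for s in samples:
        # Assumption: samples are in [-1, 1]; clip anything outside that range
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)
```

With the objects from the example above, a call could look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.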
C++ API
You can use the following code to play with vits-piper-ka_GE-natia-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
config.model.vits.tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "ღვინო თბილისში, საქართველო სამტრედში";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
Before you run /tmp/test-piper, you probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx".into()),
tokens: Some("vits-piper-ka_GE-natia-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ka_GE-natia-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "ღვინო თბილისში, საქართველო სამტრედში";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ka_GE-natia-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx',
tokens: 'vits-piper-ka_GE-natia-medium/tokens.txt',
dataDir: 'vits-piper-ka_GE-natia-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'ღვინო თბილისში, საქართველო სამტრედში';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx',
tokens: 'vits-piper-ka_GE-natia-medium/tokens.txt',
dataDir: 'vits-piper-ka_GE-natia-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'ღვინო თბილისში, საქართველო სამტრედში', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
lexicon: "",
tokens: "vits-piper-ka_GE-natia-medium/tokens.txt",
dataDir: "vits-piper-ka_GE-natia-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "ღვინო თბილისში, საქართველო სამტრედში"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ka_GE-natia-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ka_GE-natia-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ka_GE-natia-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "ღვინო თბილისში, საქართველო სამტრედში";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
tokens = "vits-piper-ka_GE-natia-medium/tokens.txt",
dataDir = "vits-piper-ka_GE-natia-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "ღვინო თბილისში, საქართველო სამტრედში",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx");
vits.setTokens("vits-piper-ka_GE-natia-medium/tokens.txt");
vits.setDataDir("vits-piper-ka_GE-natia-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "ღვინო თბილისში, საქართველო სამტრედში";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ka_GE-natia-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ka_GE-natia-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('ღვინო თბილისში, საქართველო სამტრედში', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ka_GE-natia-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ka_GE-natia-medium/ka_GE-natia-medium.onnx",
Tokens: "vits-piper-ka_GE-natia-medium/tokens.txt",
DataDir: "vits-piper-ka_GE-natia-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "ღვინო თბილისში, საქართველო სამტრედში"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
ღვინო თბილისში, საქართველო სამტრედში
sample audios for different speakers are listed below:
Speaker 0
German
This section lists text to speech models for German.
vits-piper-de_DE-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-dii-high.tar.bz2
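As a minimal sketch, the archive above could also be fetched and unpacked from Python using only the standard library. The helper names `archive_name` and `download_and_extract` are illustrative and not part of sherpa-onnx. The sketch assumes the archive unpacks into a directory named after the archive itself, which is what the model paths used in the examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/vits-piper-de_DE-dii-high.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return Path(urlparse(url).path).name


def download_and_extract(url: str, dest: str = ".") -> Path:
    """Download a .tar.bz2 archive into dest, unpack it, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()
    # Assumption: the archive unpacks into a directory named after itself.
    return dest_dir / archive.name[: -len(".tar.bz2")]
```

Calling `download_and_extract(URL, "/tmp")` would then be expected to return `/tmp/vits-piper-de_DE-dii-high`, the directory that the code examples refer to.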
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-dii-high.tar.bz2
You can use the following code to play with vits-piper-de_DE-dii-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
data_dir="vits-piper-de_DE-dii-high/espeak-ng-data",
tokens="vits-piper-de_DE-dii-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
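The Node.js example in this section reports a real-time factor (RTF), defined there as elapsed time divided by audio duration. The same figure can be computed in Python. The following is a minimal sketch: `real_time_factor` is a hypothetical helper name, not part of `sherpa_onnx`, and the numbers in the usage line are made up for illustration.

```python
def real_time_factor(num_samples: int, sample_rate: int, elapsed_seconds: float) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Made-up numbers: 2 seconds of audio at 22050 Hz, generated in 0.5 seconds.
rtf = real_time_factor(num_samples=44100, sample_rate=22050, elapsed_seconds=0.5)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

To use it with the example above, one option is to take `time.perf_counter()` readings before and after `tts.generate(...)` and pass `len(audio.samples)`, `audio.sample_rate`, and the difference between the two readings.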
C++ API
You can use the following code to play with vits-piper-de_DE-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-dii-high/de_DE-dii-high.onnx".into()),
tokens: Some("vits-piper-de_DE-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx',
tokens: 'vits-piper-de_DE-dii-high/tokens.txt',
dataDir: 'vits-piper-de_DE-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx',
tokens: 'vits-piper-de_DE-dii-high/tokens.txt',
dataDir: 'vits-piper-de_DE-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-dii-high/tokens.txt",
dataDir: "vits-piper-de_DE-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
tokens = "vits-piper-de_DE-dii-high/tokens.txt",
dataDir = "vits-piper-de_DE-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-dii-high/de_DE-dii-high.onnx");
vits.setTokens("vits-piper-de_DE-dii-high/tokens.txt");
vits.setDataDir("vits-piper-de_DE-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-dii-high/de_DE-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-dii-high/de_DE-dii-high.onnx",
Tokens: "vits-piper-de_DE-dii-high/tokens.txt",
DataDir: "vits-piper-de_DE-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-eva_k-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/eva_k/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-eva_k-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
data_dir="vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
tokens="vits-piper-de_DE-eva_k-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
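The configuration above points at three paths inside the extracted model directory: the `.onnx` model, `tokens.txt`, and the `espeak-ng-data` directory. As a minimal sketch, those paths could be checked before constructing the config, so that a missing download shows up as a readable message. The helper `missing_model_files` is a hypothetical name and not part of `sherpa_onnx`.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-de_DE-eva_k-x_low", "de_DE-eva_k-x_low.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

This check only looks at the file system and complements, rather than replaces, the `config.validate()` call used in the example.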
C++ API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx".into()),
tokens: Some("vits-piper-de_DE-eva_k-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-eva_k-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx',
tokens: 'vits-piper-de_DE-eva_k-x_low/tokens.txt',
dataDir: 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
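The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic is shown below as a minimal Python sketch. The helper name `real_time_factor` is hypothetical and is not part of sherpa-onnx, and the numbers are illustrative only.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 2 s of audio at 16 kHz generated in 0.5 s.
    rtf = real_time_factor(0.5, 32000, 16000)
    print(f"RTF = {rtf:.3f}")
```

The duration is derived from the sample count and the sample rate, which is why the Node.js code divides `audio.samples.length` by `audio.sampleRate` before computing the ratio.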
Dart API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx',
tokens: 'vits-piper-de_DE-eva_k-x_low/tokens.txt',
dataDir: 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-eva_k-x_low/tokens.txt",
dataDir: "vits-piper-de_DE-eva_k-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
tokens = "vits-piper-de_DE-eva_k-x_low/tokens.txt",
dataDir = "vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
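The callback in the example above returns 1 to continue and 0 to stop. The sketch below illustrates that contract with a stand-in generator. Both `fake_generate` and `stop_after` are hypothetical names: they only mimic how a chunked generator might consult the callback and are not the sherpa-onnx implementation, which may handle the final chunk differently.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a chunked TTS generator."""
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        # The callback sees each new chunk; returning 0 requests a stop.
        if callback(chunk) == 0:
            break
    return produced


def stop_after(limit: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once `limit` samples were seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < limit else 0

    return callback


if __name__ == "__main__":
    chunks = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    print(len(fake_generate(chunks, lambda s: 1)))    # callback never stops
    print(len(fake_generate(chunks, stop_after(3))))  # callback stops early
```

A callback of this kind is typically used to stream audio to a player while generation is still running, or to cancel generation when the user interrupts playback.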
Java API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx");
vits.setTokens("vits-piper-de_DE-eva_k-x_low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-eva_k-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-eva_k-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-eva_k-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-eva_k-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-eva_k-x_low/de_DE-eva_k-x_low.onnx",
Tokens: "vits-piper-de_DE-eva_k-x_low/tokens.txt",
DataDir: "vits-piper-de_DE-eva_k-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
data_dir="vits-piper-de_DE-glados-high/espeak-ng-data",
tokens="vits-piper-de_DE-glados-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
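The Python example above relies on the third-party soundfile package to write the output. If that package is unavailable, the samples can be written with the standard-library `wave` module instead. The sketch below assumes the samples are mono floats in the range [-1, 1]; `write_wave_pcm16` is a hypothetical helper, not a sherpa-onnx function, and the sine tone only stands in for `audio.samples`.

```python
import math
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)  # e.g. the model's sample rate
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in for audio.samples: 0.1 s of a 440 Hz tone at 22050 Hz.
    rate = 22050
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 10)]
    write_wave_pcm16("tone.wav", tone, rate)
```

With a real model, the call would take the generated samples and sample rate in place of `tone` and `rate`.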
C++ API
You can use the following code to play with vits-piper-de_DE-glados-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados-high/de_DE-glados-high.onnx".into()),
tokens: Some("vits-piper-de_DE-glados-high/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx',
tokens: 'vits-piper-de_DE-glados-high/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-glados-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx',
tokens: 'vits-piper-de_DE-glados-high/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados-high/tokens.txt",
dataDir: "vits-piper-de_DE-glados-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
tokens = "vits-piper-de_DE-glados-high/tokens.txt",
dataDir = "vits-piper-de_DE-glados-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-glados-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados-high/de_DE-glados-high.onnx");
vits.setTokens("vits-piper-de_DE-glados-high/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados-high/de_DE-glados-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados-high/de_DE-glados-high.onnx",
Tokens: "vits-piper-de_DE-glados-high/tokens.txt",
DataDir: "vits-piper-de_DE-glados-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
data_dir="vits-piper-de_DE-glados-low/espeak-ng-data",
tokens="vits-piper-de_DE-glados-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-glados-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados-low/de_DE-glados-low.onnx".into()),
tokens: Some("vits-piper-de_DE-glados-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx',
tokens: 'vits-piper-de_DE-glados-low/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
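The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. A minimal Python sketch of the same arithmetic follows. The function name `real_time_factor` and the numbers used are illustrative assumptions and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 2 s of audio at 22050 Hz.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```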
Dart API
You can use the following code to play with vits-piper-de_DE-glados-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx',
tokens: 'vits-piper-de_DE-glados-low/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados-low/tokens.txt",
dataDir: "vits-piper-de_DE-glados-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
tokens = "vits-piper-de_DE-glados-low/tokens.txt",
dataDir = "vits-piper-de_DE-glados-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-glados-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados-low/de_DE-glados-low.onnx");
vits.setTokens("vits-piper-de_DE-glados-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados-low/de_DE-glados-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados-low/de_DE-glados-low.onnx",
Tokens: "vits-piper-de_DE-glados-low/tokens.txt",
DataDir: "vits-piper-de_DE-glados-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
            data_dir="vits-piper-de_DE-glados-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-glados-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
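If soundfile is not available, the samples can also be written with Python's standard `wave` module. The sketch below assumes the samples are floats in the range [-1, 1] (an assumption about the generated audio, so values are clipped to be safe) and uses a synthetic sine tone as a stand-in for `audio.samples`. The helper name `write_wav_16bit` and the file name `tone.wav` are hypothetical.

```python
import math
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples: one second of a 440 Hz tone at 22050 Hz.
sample_rate = 22050
samples = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(sample_rate)]
write_wav_16bit("tone.wav", samples, sample_rate)

# Read the header back to confirm the sample rate and frame count.
with wave.open("tone.wav", "rb") as f:
    print(f.getframerate(), f.getnframes())
```

With real output, `samples` and `sample_rate` would be replaced by `audio.samples` and `audio.sample_rate` from the example above.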
C++ API
You can use the following code to play with vits-piper-de_DE-glados-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx".into()),
tokens: Some("vits-piper-de_DE-glados-medium/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx',
tokens: 'vits-piper-de_DE-glados-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-glados-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx',
tokens: 'vits-piper-de_DE-glados-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-glados-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados-medium/tokens.txt",
dataDir: "vits-piper-de_DE-glados-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
tokens = "vits-piper-de_DE-glados-medium/tokens.txt",
dataDir = "vits-piper-de_DE-glados-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
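The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch imitates that protocol with a hypothetical chunked generator, `fake_generate`, which is not part of sherpa-onnx. It only illustrates how a return value of 0 ends generation early. Keeping the chunk on which the callback returned 0 is a choice made in this sketch, not a statement about the real engine.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine that emits audio chunk by chunk."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:  # 0 means stop, 1 means continue
            break
    return collected


def stop_after_two_chunks() -> Callable[[List[float]], int]:
    """Build a callback that asks the generator to stop after two chunks."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < 2 else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
audio = fake_generate(chunks, stop_after_two_chunks())
print(audio)  # prints [0.1, 0.2, 0.3]
```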
Java API
You can use the following code to play with vits-piper-de_DE-glados-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx");
vits.setTokens("vits-piper-de_DE-glados-medium/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados-medium/de_DE-glados-medium.onnx",
Tokens: "vits-piper-de_DE-glados-medium/tokens.txt",
DataDir: "vits-piper-de_DE-glados-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados_turret-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados_turret-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
            data_dir="vits-piper-de_DE-glados_turret-high/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
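Before building the config, it can help to confirm that the three paths it refers to actually exist. The following is a minimal sketch using only the Python standard library. `missing_model_files` is a hypothetical helper, not part of sherpa-onnx, and it assumes the model archive was extracted into the current directory with the layout used above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that do not exist (hypothetical helper)."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # acoustic model file
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # directory passed as data_dir
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "vits-piper-de_DE-glados_turret-high",
    "de_DE-glados_turret-high.onnx",
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

An empty list means all three paths are present. It does not check that the files themselves are valid.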
C++ API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
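The callback contract shown above (return 1 to continue, 0 to stop) can be used to cap how much audio is generated. The sketch below illustrates that idea in Python using only the standard library. `SampleBudget` is a hypothetical class, and the `(samples, progress)` argument shape mirrors the C++ callback above; it is an assumption, not a documented Python signature. The chunks are simulated, so no model is needed to run it.

```python
class SampleBudget:
    """Callable that asks the generator to stop once a sample budget is used up."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress: float) -> int:
        self.received += len(samples)
        # 1 means continue generating, 0 means stop (same convention as above).
        return 1 if self.received < self.max_samples else 0


# Simulate three chunks of 100 samples each against a budget of 200 samples.
budget = SampleBudget(max_samples=200)
results = [budget([0.0] * 100, p) for p in (0.33, 0.66, 1.0)]
print(results)  # [1, 0, 0]
```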
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx".into()),
tokens: Some("vits-piper-de_DE-glados_turret-high/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados_turret-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados_turret-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx',
tokens: 'vits-piper-de_DE-glados_turret-high/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
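The real-time factor (RTF) printed by the Node.js example is the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. Here is a minimal Python sketch of the same arithmetic. `real_time_factor` is a hypothetical helper and the numbers are made up for illustration, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Elapsed synthesis time divided by audio duration, as in the Node.js example."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Illustrative numbers only: 0.5 s spent generating 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # RTF = 0.250
```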
Dart API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx',
tokens: 'vits-piper-de_DE-glados_turret-high/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados_turret-high/tokens.txt",
dataDir: "vits-piper-de_DE-glados_turret-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
tokens = "vits-piper-de_DE-glados_turret-high/tokens.txt",
dataDir = "vits-piper-de_DE-glados_turret-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx");
vits.setTokens("vits-piper-de_DE-glados_turret-high/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados_turret-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados_turret-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados_turret-high/de_DE-glados_turret-high.onnx",
Tokens: "vits-piper-de_DE-glados_turret-high/tokens.txt",
DataDir: "vits-piper-de_DE-glados_turret-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados_turret-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
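The sample rate in the table can be checked against a generated file. The sketch below uses Python's standard `wave` module, which reads PCM-encoded WAV files; it assumes the generated file is PCM. `wav_info` is a hypothetical helper, and a short silent file is written first only so that the snippet runs on its own.

```python
import os
import tempfile
import wave


def wav_info(path: str) -> tuple:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Write one second of 16-bit mono silence at 16000 Hz as a stand-in for test.wav.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    f.setframerate(16000)
    f.writeframes(b"\x00\x00" * 16000)

rate, channels, seconds = wav_info(demo_path)
print(rate, channels, seconds)
```

To inspect real output, call `wav_info("./test.wav")` after running one of the examples in this section that write that file.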
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados_turret-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
            data_dir="vits-piper-de_DE-glados_turret-low/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
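If the compiled binary is launched from a script rather than an interactive shell, the same library search path can be set on the child process environment. The following is a minimal Python sketch of that idea. `library_path_env` is a hypothetical helper, and it simply picks `DYLD_LIBRARY_PATH` on macOS and `LD_LIBRARY_PATH` otherwise, matching the two exports above.

```python
import sys


def library_path_env(lib_dir: str, env: dict, platform: str = sys.platform) -> dict:
    """Return a copy of env with lib_dir prepended to the dynamic-linker search path."""
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    updated = dict(env)  # leave the caller's mapping untouched
    current = updated.get(var, "")
    updated[var] = f"{lib_dir}:{current}" if current else lib_dir
    return updated


env = library_path_env("/tmp/sherpa-onnx/shared/lib", {"LD_LIBRARY_PATH": "/opt/lib"}, "linux")
print(env["LD_LIBRARY_PATH"])  # /tmp/sherpa-onnx/shared/lib:/opt/lib
```

In practice the starting mapping would usually be `dict(os.environ)`, and the result can be passed as the `env` argument of `subprocess.run`.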
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx".into()),
tokens: Some("vits-piper-de_DE-glados_turret-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados_turret-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados_turret-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx',
tokens: 'vits-piper-de_DE-glados_turret-low/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx',
tokens: 'vits-piper-de_DE-glados_turret-low/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados_turret-low/tokens.txt",
dataDir: "vits-piper-de_DE-glados_turret-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
tokens = "vits-piper-de_DE-glados_turret-low/tokens.txt",
dataDir = "vits-piper-de_DE-glados_turret-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx");
vits.setTokens("vits-piper-de_DE-glados_turret-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados_turret-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados_turret-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados_turret-low/de_DE-glados_turret-low.onnx",
Tokens: "vits-piper-de_DE-glados_turret-low/tokens.txt",
DataDir: "vits-piper-de_DE-glados_turret-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-glados_turret-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/systemofapwne/piper-de-glados
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-glados_turret-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
            data_dir="vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-glados_turret-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
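Before constructing the config, it can help to confirm that the files referenced above actually exist, since a wrong path is one likely reason for the config check to fail. The following is a minimal sketch using only the standard library. The helper name `missing_model_files` is hypothetical and not part of sherpa-onnx, and the paths assume the model was extracted into the current directory:

```python
from pathlib import Path

MODEL_DIR = "vits-piper-de_DE-glados_turret-medium"


def missing_model_files(model_dir):
    """Return the expected model paths that are absent under model_dir.

    Hypothetical helper, not part of sherpa-onnx.
    """
    root = Path(model_dir)
    expected = [
        root / "de_DE-glados_turret-medium.onnx",
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(MODEL_DIR)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

An empty list means all three paths used in the example are present.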
C++ API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx".into()),
tokens: Some("vits-piper-de_DE-glados_turret-medium/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-glados_turret-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
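The example above prints the number of speakers reported by the model. Since this model has a single speaker, the only valid speaker ID is 0. A small check along those lines might look like the following Python sketch. `check_sid` is a hypothetical helper and does not describe how sherpa-onnx itself handles out-of-range IDs:

```python
def check_sid(sid, num_speakers):
    """Return sid if it lies in [0, num_speakers), else raise ValueError.

    Hypothetical helper, not part of sherpa-onnx.
    """
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be in [0, {num_speakers - 1}], got {sid}"
        )
    return sid


if __name__ == "__main__":
    print(check_sid(0, num_speakers=1))
```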
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx',
tokens: 'vits-piper-de_DE-glados_turret-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
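The real-time factor (RTF) printed by the script above is the synthesis time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. `real_time_factor` is a hypothetical helper, and the numbers in the usage lines are made up for illustration rather than measured:

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    Hypothetical helper mirroring the arithmetic in the Node.js script.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative values only: 2 s of audio at 22050 Hz generated in 0.5 s.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play back.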
Dart API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx',
tokens: 'vits-piper-de_DE-glados_turret-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-glados_turret-medium/tokens.txt",
dataDir: "vits-piper-de_DE-glados_turret-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
tokens = "vits-piper-de_DE-glados_turret-medium/tokens.txt",
dataDir = "vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx");
vits.setTokens("vits-piper-de_DE-glados_turret-medium/tokens.txt");
vits.setDataDir("vits-piper-de_DE-glados_turret-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-glados_turret-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-glados_turret-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-glados_turret-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-glados_turret-medium/de_DE-glados_turret-medium.onnx",
Tokens: "vits-piper-de_DE-glados_turret-medium/tokens.txt",
DataDir: "vits-piper-de_DE-glados_turret-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-karlsson-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/karlsson/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-karlsson-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-karlsson-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
            data_dir="vits-piper-de_DE-karlsson-low/espeak-ng-data",
            tokens="vits-piper-de_DE-karlsson-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
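If soundfile is not available, the samples can also be written as a WAV file with Python's built-in `wave` module. This is a sketch under the assumption that the generated samples are floats in the range [-1, 1]. `write_wav_16bit` is a hypothetical helper, and the sine tone below merely stands in for `audio.samples`:

```python
import math
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as 16-bit PCM.

    Hypothetical helper, not part of sherpa-onnx.
    """
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        pcm += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


if __name__ == "__main__":
    # Stand-in for audio.samples: 0.1 s of a 440 Hz tone at 16000 Hz.
    rate = 16000
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
    write_wav_16bit("tone.wav", tone, rate)
```

With real output, the call would presumably become `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.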
C++ API
You can use the following code to play with vits-piper-de_DE-karlsson-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx".into()),
tokens: Some("vits-piper-de_DE-karlsson-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-karlsson-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-karlsson-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx',
tokens: 'vits-piper-de_DE-karlsson-low/tokens.txt',
dataDir: 'vits-piper-de_DE-karlsson-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx',
tokens: 'vits-piper-de_DE-karlsson-low/tokens.txt',
dataDir: 'vits-piper-de_DE-karlsson-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-karlsson-low/tokens.txt",
dataDir: "vits-piper-de_DE-karlsson-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-karlsson-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-karlsson-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-karlsson-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
tokens = "vits-piper-de_DE-karlsson-low/tokens.txt",
dataDir = "vits-piper-de_DE-karlsson-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
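The callback in the example above follows a simple contract: return 1 to keep generating and 0 to stop. To illustrate that contract independently of any binding, here is a Python sketch of a callback object that asks for generation to stop once a sample budget is reached. The class `SampleBudget` is hypothetical, and the loop that feeds it chunks only simulates how an engine might invoke the callback:

```python
class SampleBudget:
    """Callable that returns 1 to continue and 0 to stop.

    Hypothetical helper; it only models the 1/0 contract described above.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.seen = 0

    def __call__(self, samples):
        self.seen += len(samples)
        return 1 if self.seen < self.max_samples else 0


if __name__ == "__main__":
    callback = SampleBudget(max_samples=16000)  # about 1 s at 16000 Hz
    # Simulated engine loop: three chunks of 8000 samples each.
    for chunk in ([0.0] * 8000 for _ in range(3)):
        if callback(chunk) == 0:
            break
    print("Samples seen before stopping:", callback.seen)
```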
Java API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx");
vits.setTokens("vits-piper-de_DE-karlsson-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-karlsson-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-karlsson-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-karlsson-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-karlsson-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-karlsson-low/de_DE-karlsson-low.onnx",
Tokens: "vits-piper-de_DE-karlsson-low/tokens.txt",
DataDir: "vits-piper-de_DE-karlsson-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-kerstin-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/kerstin/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
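One way to confirm that a generated file matches the sample rate listed above is to read its header with Python's standard wave module. The sketch below is an illustration and not part of sherpa-onnx: the helper names describe_wave and write_silence are hypothetical, and the file it inspects is a short silent placeholder written by the script itself so that the example runs without the model. Point describe_wave at your own test.wav instead. The wave module reads PCM WAV files only.

```python
import os
import tempfile
import wave


def describe_wave(path):
    """Return (sample_rate, num_channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration


def write_silence(path, sample_rate, seconds):
    """Write a mono 16-bit silent WAV file; a stand-in for real TTS output."""
    num_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * num_frames)


# Placeholder file; replace demo_path with the path to your own test.wav.
demo_path = os.path.join(tempfile.mkdtemp(), "placeholder.wav")
write_silence(demo_path, sample_rate=16000, seconds=0.5)

rate, channels, duration = describe_wave(demo_path)
print(f"sample rate: {rate} Hz, channels: {channels}, duration: {duration:.3f} s")
```

If the reported sample rate differs from the table, the file was probably produced by a different model.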
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-kerstin-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
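The choice between the two variables depends only on the operating system. As a rough sketch, the following Python helper builds the matching export line and checks whether a directory is already on the loader search path. The function names are hypothetical and only Linux and macOS are handled, mirroring the two cases above.

```python
import os
import platform


def library_path_variable(system):
    """Return the environment variable the dynamic loader consults on `system`."""
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":  # macOS
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"unsupported system: {system}")


def export_command(lib_dir, system):
    """Build the shell command that prepends lib_dir to the loader search path."""
    var = library_path_variable(system)
    return f"export {var}={lib_dir}:${var}"


def is_on_library_path(lib_dir, system, environ):
    """Check whether lib_dir is already listed in the relevant variable."""
    var = library_path_variable(system)
    return lib_dir in environ.get(var, "").split(":")


lib_dir = "/tmp/sherpa-onnx/shared/lib"
for system in ("Linux", "Darwin"):
    print(system, "->", export_command(lib_dir, system))

current = platform.system()
if current in ("Linux", "Darwin"):
    print("already set:", is_on_library_path(lib_dir, current, os.environ))
```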
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-kerstin-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
            data_dir="vits-piper-de_DE-kerstin-low/espeak-ng-data",
            tokens="vits-piper-de_DE-kerstin-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Alles hat ein Ende, nur die Wurst hat zwei.",
    sid=0,
    speed=1.0,
)
# Writing MP3 needs a soundfile build backed by libsndfile 1.1.0 or newer;
# change the name to "test.wav" if MP3 output is not available.
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-kerstin-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx".into()),
tokens: Some("vits-piper-de_DE-kerstin-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-kerstin-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-kerstin-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx',
tokens: 'vits-piper-de_DE-kerstin-low/tokens.txt',
dataDir: 'vits-piper-de_DE-kerstin-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
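The script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. Values below 1 mean that synthesis runs faster than playback. The same arithmetic in Python, as a small sketch; the function name and the example numbers are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds


# Illustrative numbers only: 48000 samples at 16000 Hz is 3 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.75, num_samples=48000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```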
Dart API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx',
tokens: 'vits-piper-de_DE-kerstin-low/tokens.txt',
dataDir: 'vits-piper-de_DE-kerstin-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-kerstin-low/tokens.txt",
dataDir: "vits-piper-de_DE-kerstin-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-kerstin-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-kerstin-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-kerstin-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
tokens = "vits-piper-de_DE-kerstin-low/tokens.txt",
dataDir = "vits-piper-de_DE-kerstin-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
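The callback in the example above controls generation through its return value: 1 lets synthesis continue and 0 stops it. The Python sketch below only simulates that contract so that it can run without a model. fake_generate, StopAfter, and the chunk data are hypothetical stand-ins and not part of the sherpa-onnx API; whether a real engine keeps the chunk that triggered the stop is not something this sketch claims.

```python
def fake_generate(chunks, callback):
    """Hand each chunk to `callback`; stop as soon as it returns 0.

    Hypothetical stand-in for a TTS engine that produces audio piece by piece.
    """
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        if callback(chunk) == 0:
            break  # the caller asked to stop
    return delivered


class StopAfter:
    """Callback that returns 1 (continue) until `limit` samples were seen."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = 0

    def __call__(self, samples):
        self.seen += len(samples)
        return 1 if self.seen < self.limit else 0


chunks = [[0.0] * 100, [0.0] * 100, [0.0] * 100, [0.0] * 100]
callback = StopAfter(limit=250)
delivered = fake_generate(chunks, callback)
print(f"delivered {len(delivered)} of {len(chunks)} chunks")
```

A callback that always returns 1, as in the Kotlin example, would receive every chunk.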
Java API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx");
vits.setTokens("vits-piper-de_DE-kerstin-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-kerstin-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-kerstin-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-kerstin-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-kerstin-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-kerstin-low/de_DE-kerstin-low.onnx",
Tokens: "vits-piper-de_DE-kerstin-low/tokens.txt",
DataDir: "vits-piper-de_DE-kerstin-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-miro-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
            data_dir="vits-piper-de_DE-miro-high/espeak-ng-data",
            tokens="vits-piper-de_DE-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Alles hat ein Ende, nur die Wurst hat zwei.",
    sid=0,
    speed=1.0,
)
# Writing MP3 needs a soundfile build backed by libsndfile 1.1.0 or newer;
# change the name to "test.wav" if MP3 output is not available.
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-miro-high/de_DE-miro-high.onnx".into()),
tokens: Some("vits-piper-de_DE-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx',
tokens: 'vits-piper-de_DE-miro-high/tokens.txt',
dataDir: 'vits-piper-de_DE-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx',
tokens: 'vits-piper-de_DE-miro-high/tokens.txt',
dataDir: 'vits-piper-de_DE-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-miro-high/tokens.txt",
dataDir: "vits-piper-de_DE-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
tokens = "vits-piper-de_DE-miro-high/tokens.txt",
dataDir = "vits-piper-de_DE-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
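The callback in the example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract only. It does not call sherpa-onnx: `make_sample_limit_callback` and `drive` are hypothetical names, and `drive` is a stand-in for the synthesis loop that would normally hand audio chunks to the callback.

```python
def make_sample_limit_callback(max_samples):
    """Build a callback that asks the generator to stop once enough samples arrived."""
    collected = []

    def callback(samples):
        collected.extend(samples)
        # 1 means to continue, 0 means to stop (same convention as above)
        return 1 if len(collected) < max_samples else 0

    return callback, collected


def drive(chunks, callback):
    """Hypothetical stand-in for the synthesis loop: feed chunks until told to stop."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered
```

A callback of this shape is useful when you only need a short preview of the audio, or when the user cancels playback while generation is still running.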
Java API
You can use the following code to play with vits-piper-de_DE-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-miro-high/de_DE-miro-high.onnx");
vits.setTokens("vits-piper-de_DE-miro-high/tokens.txt");
vits.setDataDir("vits-piper-de_DE-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-miro-high/de_DE-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-miro-high/de_DE-miro-high.onnx",
Tokens: "vits-piper-de_DE-miro-high/tokens.txt",
DataDir: "vits-piper-de_DE-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-pavoque-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/pavoque/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
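Once you have generated a wave file with any of the APIs below, you may want to confirm that it matches the table above (one channel, 16000 Hz). The following is a minimal sketch using only the Python standard library. `wav_info` is a hypothetical helper name, and it assumes the file is an uncompressed PCM WAV file.

```python
import wave


def wav_info(path):
    """Return (num_channels, sample_rate, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, duration
```

For example, `wav_info("test.wav")` should report a sample rate of 16000 for audio generated by this model.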
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-pavoque-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-pavoque-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
            data_dir="vits-piper-de_DE-pavoque-low/espeak-ng-data",
            tokens="vits-piper-de_DE-pavoque-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
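The three paths in the config above follow a fixed layout inside the model directory. As a minimal sketch (the helper names `piper_model_files` and `missing_files` are hypothetical, and the layout is assumed from the paths used in this document), you could check that the files exist before creating the TTS object:

```python
from pathlib import Path


def piper_model_files(model_dir):
    """Map config field names to the paths expected in a vits-piper model directory."""
    root = Path(model_dir)
    # e.g. "vits-piper-de_DE-pavoque-low" -> "de_DE-pavoque-low" (needs Python 3.9+)
    voice = root.name.removeprefix("vits-piper-")
    return {
        "model": root / f"{voice}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_files(model_dir):
    """Return the expected paths that do not exist on disk."""
    return [str(p) for p in piper_model_files(model_dir).values() if not p.exists()]
```

If `missing_files("vits-piper-de_DE-pavoque-low")` returns a non-empty list, the model has probably not been downloaded or extracted into the current directory.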
C++ API
You can use the following code to play with vits-piper-de_DE-pavoque-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx".into()),
tokens: Some("vits-piper-de_DE-pavoque-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-pavoque-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-pavoque-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx',
tokens: 'vits-piper-de_DE-pavoque-low/tokens.txt',
dataDir: 'vits-piper-de_DE-pavoque-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
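The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means generation is faster than playback. The same arithmetic as a small Python sketch (`real_time_factor` is a hypothetical helper name, not part of sherpa-onnx):

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration
```

For example, generating 32000 samples at 16000 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.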
Dart API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx',
tokens: 'vits-piper-de_DE-pavoque-low/tokens.txt',
dataDir: 'vits-piper-de_DE-pavoque-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-pavoque-low/tokens.txt",
dataDir: "vits-piper-de_DE-pavoque-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-pavoque-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-pavoque-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-pavoque-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
tokens = "vits-piper-de_DE-pavoque-low/tokens.txt",
dataDir = "vits-piper-de_DE-pavoque-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx");
vits.setTokens("vits-piper-de_DE-pavoque-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-pavoque-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-pavoque-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-pavoque-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-pavoque-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-pavoque-low/de_DE-pavoque-low.onnx",
Tokens: "vits-piper-de_DE-pavoque-low/tokens.txt",
DataDir: "vits-piper-de_DE-pavoque-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-de_DE-ramona-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/ramona/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-ramona-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-ramona-low with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
            data_dir="vits-piper-de_DE-ramona-low/espeak-ng-data",
            tokens="vits-piper-de_DE-ramona-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
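If soundfile is not installed, the samples can also be written with the Python standard library. The following is a minimal sketch, not the library's own writer. `write_wav_pcm16` is a hypothetical helper name, and it assumes the samples are mono floats in the range [-1, 1], which it clips and converts to 16-bit PCM.

```python
import struct
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, the call would be `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.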
C++ API
You can use the following code to play with vits-piper-de_DE-ramona-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-ramona-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx".into()),
tokens: Some("vits-piper-de_DE-ramona-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-ramona-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-ramona-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx',
tokens: 'vits-piper-de_DE-ramona-low/tokens.txt',
dataDir: 'vits-piper-de_DE-ramona-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-ramona-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx',
tokens: 'vits-piper-de_DE-ramona-low/tokens.txt',
dataDir: 'vits-piper-de_DE-ramona-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-ramona-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-ramona-low/tokens.txt",
dataDir: "vits-piper-de_DE-ramona-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-ramona-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-ramona-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-ramona-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-ramona-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
tokens = "vits-piper-de_DE-ramona-low/tokens.txt",
dataDir = "vits-piper-de_DE-ramona-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
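The callback in the example above returns 1 to continue generating and 0 to stop. A minimal sketch of that convention in plain Python is shown below: a callable that collects chunks and asks for a stop once a sample budget is reached. The class name, the `max_samples` parameter, and the `(samples, progress)` signature are assumptions made for illustration; check the API documentation of your language binding for the exact callback signature.

```python
class StopAfter:
    """Collect generated chunks and request a stop once max_samples is reached."""

    def __init__(self, max_samples: int):
        self.max_samples = max_samples
        self.chunks = []
        self.total = 0

    def __call__(self, samples, progress: float = 0.0) -> int:
        self.chunks.append(list(samples))
        self.total += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if self.total < self.max_samples else 0


# Simulate two chunks arriving from a generator (hypothetical data).
cb = StopAfter(max_samples=5)
results = [cb([0.0, 0.1, 0.2]), cb([0.3, 0.4, 0.5])]
print(results, cb.total)
```

The first call stays under the budget and returns 1; the second call pushes the total past the budget and returns 0.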
Java API
You can use the following code to play with vits-piper-de_DE-ramona-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx");
vits.setTokens("vits-piper-de_DE-ramona-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-ramona-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-ramona-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-ramona-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-ramona-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-ramona-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-ramona-low/de_DE-ramona-low.onnx",
Tokens: "vits-piper-de_DE-ramona-low/tokens.txt",
DataDir: "vits-piper-de_DE-ramona-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-de_DE-thorsten-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
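To check that a generated file matches the sample rate listed in the table above, you can inspect its header. The following is a minimal sketch that assumes the output is a PCM WAV file readable by Python's standard `wave` module. The helper name `wav_info` is hypothetical, and the demo writes a synthetic one-second file instead of using a real model output.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Demo: write one second of 16-bit mono silence at 22050 Hz, then read it back.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(22050)
    f.writeframes(b"\x00\x00" * 22050)

rate, channels, seconds = wav_info(path)
print(rate, channels, seconds)
```

For a real run, pass the path of the file written by one of the examples, for instance `wav_info("test.wav")`.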
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-thorsten-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
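If you launch the compiled binary from a script, the library search path can be set for that process only. The following is a minimal sketch in Python, assuming the install prefix used above. The helper name `env_with_library_path` and the `/opt/lib` entry are hypothetical.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path (DYLD_LIBRARY_PATH on macOS, LD_LIBRARY_PATH otherwise)."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


lib_dir = "/tmp/sherpa-onnx/shared/lib"
linux_env = env_with_library_path(lib_dir, system="Linux", base_env={})
mac_env = env_with_library_path(
    lib_dir, system="Darwin", base_env={"DYLD_LIBRARY_PATH": "/opt/lib"}
)
print(linux_env["LD_LIBRARY_PATH"])
print(mac_env["DYLD_LIBRARY_PATH"])

# To run the example binary with this environment:
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path(lib_dir))
```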
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-thorsten-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
data_dir="vits-piper-de_DE-thorsten-high/espeak-ng-data",
tokens="vits-piper-de_DE-thorsten-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
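The three paths passed to the config above follow a pattern: the directory is named `vits-piper-<voice>`, the model file is `<voice>.onnx`, and `tokens.txt` and `espeak-ng-data` sit next to it. The following is a minimal sketch of a helper that derives these paths and reports which ones are missing before you build the config. The helper names are hypothetical, and the naming pattern is only inferred from the models in this section; other models may be laid out differently.

```python
from pathlib import Path


def piper_model_paths(model_dir):
    """Derive the model, tokens, and data_dir paths from a vits-piper-* directory."""
    root = Path(model_dir)
    prefix = "vits-piper-"
    if not root.name.startswith(prefix):
        raise ValueError(f"expected a directory named {prefix}<voice>, got {root.name!r}")
    voice = root.name[len(prefix):]
    return {
        "model": root / f"{voice}.onnx",
        "tokens": root / "tokens.txt",
        "data_dir": root / "espeak-ng-data",
    }


def missing_files(paths):
    """Return the keys whose paths do not exist on disk."""
    return [name for name, p in paths.items() if not p.exists()]


paths = piper_model_paths("vits-piper-de_DE-thorsten-high")
print(paths["model"].as_posix())
print("Missing:", missing_files(paths))
```

The values can then be converted with `str(...)` and passed as `model`, `tokens`, and `data_dir` in the example above.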
C++ API
You can use the following code to play with vits-piper-de_DE-thorsten-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx".into()),
tokens: Some("vits-piper-de_DE-thorsten-high/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-thorsten-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-thorsten-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx',
tokens: 'vits-piper-de_DE-thorsten-high/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx',
tokens: 'vits-piper-de_DE-thorsten-high/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-thorsten-high/tokens.txt",
dataDir: "vits-piper-de_DE-thorsten-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-thorsten-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
tokens = "vits-piper-de_DE-thorsten-high/tokens.txt",
dataDir = "vits-piper-de_DE-thorsten-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx");
vits.setTokens("vits-piper-de_DE-thorsten-high/tokens.txt");
vits.setDataDir("vits-piper-de_DE-thorsten-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-thorsten-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-thorsten-high/de_DE-thorsten-high.onnx",
Tokens: "vits-piper-de_DE-thorsten-high/tokens.txt",
DataDir: "vits-piper-de_DE-thorsten-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-de_DE-thorsten-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-thorsten-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-thorsten-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
data_dir="vits-piper-de_DE-thorsten-low/espeak-ng-data",
tokens="vits-piper-de_DE-thorsten-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-thorsten-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx".into()),
tokens: Some("vits-piper-de_DE-thorsten-low/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-thorsten-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-thorsten-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx',
tokens: 'vits-piper-de_DE-thorsten-low/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
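The script above reports a real-time factor (RTF): the time spent synthesizing divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python as follows; the function name `real_time_factor` is hypothetical and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Example: 3 seconds of 22050 Hz audio generated in 0.6 seconds of wall-clock time.
rtf = real_time_factor(0.6, 3 * 22050, 22050)
print(f"RTF = {rtf:.3f}")
```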
Dart API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx',
tokens: 'vits-piper-de_DE-thorsten-low/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-thorsten-low/tokens.txt",
dataDir: "vits-piper-de_DE-thorsten-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-thorsten-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
tokens = "vits-piper-de_DE-thorsten-low/tokens.txt",
dataDir = "vits-piper-de_DE-thorsten-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
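The callback in the example above always returns 1, so generation always runs to completion. To illustrate the continue/stop convention, here is a minimal Python sketch of a callback that asks generation to stop after a sample budget is exceeded. The helper `make_budget_callback` is hypothetical, and the callback signature (a chunk of samples plus an optional progress value) is an assumption modelled on the examples in this section, not a confirmed sherpa-onnx signature.

```python
def make_budget_callback(max_samples):
    """Build a callback that returns 0 (stop) once more than
    max_samples samples have been received, and 1 (continue) otherwise."""
    received = 0

    def callback(samples, progress=0.0):
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received <= max_samples else 0

    return callback


# Feed the callback fake chunks to see when it asks to stop.
cb = make_budget_callback(max_samples=5)
print([cb([0.0] * 3), cb([0.0] * 2), cb([0.0])])
```

Keeping the counter inside a closure avoids global state, so several generators could each get their own budget.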
Java API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx");
vits.setTokens("vits-piper-de_DE-thorsten-low/tokens.txt");
vits.setDataDir("vits-piper-de_DE-thorsten-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-thorsten-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-thorsten-low/de_DE-thorsten-low.onnx",
Tokens: "vits-piper-de_DE-thorsten-low/tokens.txt",
DataDir: "vits-piper-de_DE-thorsten-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-de_DE-thorsten-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
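The variable name differs by platform: the dynamic loader reads LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on macOS. As a minimal sketch, the Python helper below picks the right variable and launches the compiled binary with it set. The names `env_with_library_path` and `run_example` are hypothetical; the paths are the ones used above.

```python
import os
import subprocess
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    loader search path variable for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var)
    env[var] = lib_dir if not old else f"{lib_dir}{os.pathsep}{old}"
    return env


def run_example(binary="/tmp/test-piper", lib_dir="/tmp/sherpa-onnx/shared/lib"):
    # Run from /tmp so the relative model paths in the example resolve.
    env = env_with_library_path(lib_dir)
    return subprocess.run([binary], cwd="/tmp", env=env, check=True)
```

Calling `run_example()` then has the same effect as exporting the variable in the shell and running the binary from /tmp, without changing the environment of the calling process.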
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
            data_dir="vits-piper-de_DE-thorsten-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
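If soundfile is not available, the generated samples can also be written with Python's standard `wave` module. The following is a minimal sketch, assuming the samples are floats in the range [-1, 1]; the helper name `write_wav_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the `audio` object from the example above, it could be called as `write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)`.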
C++ API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx".into()),
tokens: Some("vits-piper-de_DE-thorsten-medium/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-thorsten-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
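Every example for this model points at the same three paths inside the model directory, all relative to the current working directory. A small pre-flight check can make a wrong working directory easier to spot before the TTS object is created. The Python sketch below uses a hypothetical helper, `missing_model_files`; the expected layout (including the `.onnx` file name derived from the directory name) is inferred from the paths used in the examples, not from a documented contract.

```python
import os


def missing_model_files(model_dir):
    """Return the expected paths that do not exist under model_dir."""
    name = os.path.basename(os.path.normpath(model_dir))
    prefix = "vits-piper-"
    # e.g. vits-piper-de_DE-thorsten-medium -> de_DE-thorsten-medium
    voice = name[len(prefix):] if name.startswith(prefix) else name
    expected = [
        os.path.join(model_dir, f"{voice}.onnx"),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    return [p for p in expected if not os.path.exists(p)]


# Prints an empty list when all three paths are present.
print(missing_model_files("vits-piper-de_DE-thorsten-medium"))
```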
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-thorsten-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx',
tokens: 'vits-piper-de_DE-thorsten-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx',
tokens: 'vits-piper-de_DE-thorsten-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-thorsten-medium/tokens.txt",
dataDir: "vits-piper-de_DE-thorsten-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
tokens = "vits-piper-de_DE-thorsten-medium/tokens.txt",
dataDir = "vits-piper-de_DE-thorsten-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx");
vits.setTokens("vits-piper-de_DE-thorsten-medium/tokens.txt");
vits.setDataDir("vits-piper-de_DE-thorsten-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-thorsten-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-thorsten-medium/de_DE-thorsten-medium.onnx",
Tokens: "vits-piper-de_DE-thorsten-medium/tokens.txt",
DataDir: "vits-piper-de_DE-thorsten-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-de_DE-thorsten_emotional-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_DE/thorsten_emotional/medium
| Number of speakers | Sample rate |
|---|---|
| 8 | 22050 |
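Because this model has 8 speakers, the speaker ID (`sid`) selects the voice, and valid IDs run from 0 to 7. The Python sketch below enumerates one output file per speaker. `speaker_jobs` is a hypothetical helper, and the commented-out call assumes a `tts` object created as in the Python API section for this model.

```python
def speaker_jobs(num_speakers, stem="test"):
    """Return (sid, filename) pairs, one per speaker ID in [0, num_speakers)."""
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    return [(sid, f"{stem}-sid-{sid}.wav") for sid in range(num_speakers)]


for sid, filename in speaker_jobs(8):
    # With a real model you would call something like
    #   audio = tts.generate(text=..., sid=sid, speed=1.0)
    # and then save audio.samples to `filename`.
    print(sid, filename)
```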
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Alles hat ein Ende, nur die Wurst hat zwei.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
            data_dir="vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
            tokens="vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Alles hat ein Ende, nur die Wurst hat zwei.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
config.model.vits.tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Alles hat ein Ende, nur die Wurst hat zwei.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
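The same environment setup can be done from a launcher script. The following is a minimal sketch rather than anything shipped with sherpa-onnx: library_path_variable and prepend_library_dir are hypothetical helper names, and the sketch assumes that any system other than macOS uses LD_LIBRARY_PATH.

```python
import platform


def library_path_variable(system=None):
    """Return the variable the dynamic loader reads on this OS.

    Assumption: everything that is not macOS ("Darwin") uses LD_LIBRARY_PATH.
    """
    system = system or platform.system()
    return "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"


def prepend_library_dir(env, lib_dir, system=None):
    """Return a copy of env with lib_dir placed first in the loader path."""
    var = library_path_variable(system)
    new_env = dict(env)
    old = new_env.get(var, "")
    # Keep any existing entries after the new directory.
    new_env[var] = lib_dir + (":" + old if old else "")
    return new_env
```

A launcher could then call subprocess.run(["/tmp/test-piper"], env=prepend_library_dir(os.environ, "/tmp/sherpa-onnx/shared/lib")) instead of exporting the variable in the shell.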
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx".into()),
tokens: Some("vits-piper-de_DE-thorsten_emotional-medium/tokens.txt".into()),
data_dir: Some("vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Alles hat ein Ende, nur die Wurst hat zwei.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx',
tokens: 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Alles hat ein Ende, nur die Wurst hat zwei.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
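The example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The arithmetic can be sketched in isolation as follows; real_time_factor is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    Hypothetical helper mirroring the arithmetic in the Node.js example.
    """
    duration = num_samples / sample_rate  # audio length in seconds
    if duration <= 0:
        raise ValueError("audio must contain at least one sample")
    return elapsed_seconds / duration


# 48000 samples at 24000 Hz are 2 seconds of audio; generating them
# in 0.5 seconds gives an RTF of 0.25.
print(real_time_factor(0.5, 48000, 24000))
```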
Dart API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx',
tokens: 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt',
dataDir: 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Alles hat ein Ende, nur die Wurst hat zwei.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
lexicon: "",
tokens: "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
dataDir: "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Alles hat ein Ende, nur die Wurst hat zwei."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
tokens = "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
dataDir = "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Alles hat ein Ende, nur die Wurst hat zwei.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
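The callback in the example above always returns 1, so generation never stops early. The sketch below illustrates the return-value convention with a callback that asks for a stop once a sample budget is reached. It does not call sherpa-onnx: fake_generate is a hypothetical stand-in for the engine, and the assumption that the chunk which triggers the stop is still delivered is mine.

```python
class StopAfter:
    """Callback that requests a stop once max_samples have been seen."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.num_samples = 0

    def __call__(self, samples):
        self.num_samples += len(samples)
        # 1 means continue, 0 means stop (the convention used above).
        return 1 if self.num_samples < self.max_samples else 0


def fake_generate(chunks, callback):
    """Hypothetical engine: deliver chunks until the callback returns 0."""
    delivered = []
    for chunk in chunks:
        delivered.append(chunk)
        if callback(chunk) == 0:
            break
    return delivered


callback = StopAfter(max_samples=5)
chunks = [[0.0] * 3, [0.0] * 3, [0.0] * 3]
# Only the first two chunks are delivered; the third is never generated.
print(len(fake_generate(chunks, callback)))
```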
Java API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx");
vits.setTokens("vits-piper-de_DE-thorsten_emotional-medium/tokens.txt");
vits.setDataDir("vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Alles hat ein Ende, nur die Wurst hat zwei.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-de_DE-thorsten_emotional-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Alles hat ein Ende, nur die Wurst hat zwei.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-de_DE-thorsten_emotional-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-de_DE-thorsten_emotional-medium/de_DE-thorsten_emotional-medium.onnx",
Tokens: "vits-piper-de_DE-thorsten_emotional-medium/tokens.txt",
DataDir: "vits-piper-de_DE-thorsten_emotional-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Alles hat ein Ende, nur die Wurst hat zwei."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Alles hat ein Ende, nur die Wurst hat zwei.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
supertonic-3-de
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for German (de).
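Because the language is selected with a short code at generation time, it can help to check the code against the list above before calling the engine. The following is a minimal sketch; SUPPORTED_LANGS and check_lang are hypothetical names, and the normalization (trimming and lower-casing) is an assumption rather than documented sherpa-onnx behavior.

```python
# The 31 language codes listed above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es",
    "et", "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv",
    "nl", "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk",
    "vi",
)


def check_lang(lang):
    """Return the normalized code, or raise ValueError if it is unsupported."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return code
```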
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "de"
audio = tts.generate("Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
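The configuration above references seven files in the model directory. A quick check that they are all present can give a clearer error than a failure inside the engine. The sketch below uses only the standard library; REQUIRED_FILES and missing_model_files are hypothetical names, and the file list is taken from the example above.

```python
import os

# File names used by the configuration in the example above.
REQUIRED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

For example, missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11") returns an empty list when the extracted directory is complete.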
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"de\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
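In the C example above, gen_cfg.extra is a hand-escaped JSON string. When that string is produced by a script or config generator, a JSON library avoids escaping mistakes. The following is a minimal sketch; make_extra is a hypothetical helper, and it assumes that "lang" is the only key needed.

```python
import json


def make_extra(lang):
    """Serialize the extra options as a JSON object with a single "lang" key."""
    return json.dumps({"lang": lang})


# Prints {"lang": "de"}, the same text as the C string literal above
# once its backslash escapes are resolved.
print(make_extra("de"))
```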
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "de"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "de"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'de'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'de'},
);
final audio = tts.generateWithConfig(text: 'Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "de"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"de\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "de"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"de\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "de"}';
Audio := Tts.GenerateWithConfig('Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "de"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 1
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 2
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 3
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 4
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 5
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 6
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 7
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 8
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Speaker 9
0
Hallo Welt.
1
Wie geht es dir heute?
2
Der Himmel ist blau und der Wind ist mild.
3
Maschinelles Lernen hilft Computern, aus Daten zu lernen.
4
Sprachsynthese wandelt Text in klare Sprache um.
5
Die Schüler lasen am Morgen eine kurze Geschichte.
6
Der Zug hatte wegen Wartungsarbeiten Verspätung.
7
Kleine Modelle laufen schnell auf lokalen Geräten.
8
Ein Sprachassistent hilft bei alltäglichen Aufgaben.
9
Stabiles Vorlesen ist für kurze und lange Texte wichtig.
Greek
This section lists text-to-speech models for Greek.
vits-piper-el_GR-rapunzelina-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/el/el_GR/rapunzelina/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
config.model.vits.tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-el_GR-rapunzelina-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
data_dir="vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
tokens="vits-piper-el_GR-rapunzelina-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
config.model.vits.tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx".into()),
tokens: Some("vits-piper-el_GR-rapunzelina-low/tokens.txt".into()),
data_dir: Some("vits-piper-el_GR-rapunzelina-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx',
tokens: 'vits-piper-el_GR-rapunzelina-low/tokens.txt',
dataDir: 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
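The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python. The helper name `real_time_factor` and the numbers in the usage block are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 3 seconds of 16 kHz audio produced in 0.6 s.
    rtf = real_time_factor(elapsed_seconds=0.6, num_samples=48000,
                           sample_rate=16000)
    print(f"RTF = {rtf:.3f}")
```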
Dart API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx',
tokens: 'vits-piper-el_GR-rapunzelina-low/tokens.txt',
dataDir: 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
lexicon: "",
tokens: "vits-piper-el_GR-rapunzelina-low/tokens.txt",
dataDir: "vits-piper-el_GR-rapunzelina-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx";
config.Model.Vits.Tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
tokens = "vits-piper-el_GR-rapunzelina-low/tokens.txt",
dataDir = "vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx");
vits.setTokens("vits-piper-el_GR-rapunzelina-low/tokens.txt");
vits.setDataDir("vits-piper-el_GR-rapunzelina-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-el_GR-rapunzelina-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-el_GR-rapunzelina-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-el_GR-rapunzelina-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-el_GR-rapunzelina-low/el_GR-rapunzelina-low.onnx",
Tokens: "vits-piper-el_GR-rapunzelina-low/tokens.txt",
DataDir: "vits-piper-el_GR-rapunzelina-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-el
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Greek (el).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
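The C, Java, Pascal, and Go examples for this model pass the language as a JSON string in the extra field, for example {"lang": "el"}. The following sketch builds that string and rejects codes that are not in the list of 31 languages above. The helper `make_extra` and the constant `SUPPORTED_LANGS` are hypothetical names introduced here for illustration; they are not part of sherpa-onnx.

```python
import json

# The 31 language codes listed above for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field (hypothetical helper).

    Raises ValueError if `lang` is not one of the listed language codes.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})
```

Checking the code before generation gives a clear error message in your own program instead of relying on how the engine handles an unknown language.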
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "el"
audio = tts.generate("Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
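To check the file written by the script above, you can read its header back. The following is a minimal sketch that uses only the standard-library wave module and assumes the file is a PCM-encoded WAV file, which is what wave can read. `describe_wav` is a hypothetical helper name.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read a PCM WAV header and return its basic properties."""
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": f.getnchannels(),
            "num_frames": num_frames,
            # Duration in seconds follows from frame count and sample rate.
            "duration_seconds": num_frames / sample_rate,
        }
```

For the model on this page, describe_wav("test.wav")["sample_rate"] would be expected to match the 24000 Hz listed in the table under "Info about this model".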
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"el\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "el"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "el"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
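All of the supertonic-3 examples assume that the model directory sits in the current working directory. A quick pre-flight check can catch a wrong path before the engine is created. The following is a minimal sketch in Python: the file names are copied from the configs in this section, while `missing_model_files` is a hypothetical helper and not part of sherpa-onnx.

```python
from pathlib import Path

# File names as they appear in the supertonic-3 configs in this section.
SUPERTONIC_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir, names=SUPERTONIC_FILES):
    """Return the names from `names` that are not regular files in model_dir.

    Hypothetical helper for illustration only.
    """
    root = Path(model_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

Run it from the directory where you extracted the model; an empty result means every path used in the examples resolves.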
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'el'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
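The real-time factor (RTF) printed by the example above is the synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis is faster than playback. The same arithmetic is sketched below in Python; the helper names are hypothetical and the numbers are illustrative, not measurements of this model.

```python
def wave_duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration in seconds of mono audio holding num_samples samples."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds: float, duration_seconds: float) -> float:
    """Synthesis time divided by audio duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return elapsed_seconds / duration_seconds


# Illustrative numbers only: 44100 samples at 22050 Hz, synthesized in 0.5 s.
duration = wave_duration_seconds(num_samples=44100, sample_rate=22050)
rtf = real_time_factor(elapsed_seconds=0.5, duration_seconds=duration)
print(f"duration = {duration:.3f} s, RTF = {rtf:.3f}")
```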
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'el'},
);
final audio = tts.generateWithConfig(text: 'Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "el"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"el\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
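Several examples in this section pass the `extra` option as a hand-escaped JSON string. As a minimal sketch, the same string can be produced in Python with the standard `json` module instead of escaping quotes by hand. The helper name `make_extra` is hypothetical, and `lang` is the only key that appears in the examples here.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the per-request options into a JSON string.

    Hypothetical helper; only the "lang" key is taken from the examples.
    """
    return json.dumps({"lang": lang})


extra = make_extra("el")
print(extra)
```

Building the string from a dictionary keeps the quoting correct when the value comes from user input rather than a literal.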
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "el"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
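The callbacks in these examples return 1 to continue and 0 to stop. The sketch below imitates that contract in plain Python so that the control flow is easy to follow. `fake_generate` is a made-up stand-in for a TTS engine and does not call sherpa-onnx; in particular, whether the real engine keeps the chunk on which the callback returned 0 is an assumption of this sketch.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical engine loop: hand each chunk to the callback.

    Generation stops as soon as the callback returns 0.
    """
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:
            break
    return produced


def stop_after_two_chunks() -> Callable[[List[float]], int]:
    """Build a callback that returns 1 once, then 0."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < 2 else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
audio = fake_generate(chunks, stop_after_two_chunks())
print(len(audio))
```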
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"el\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "el"}';
Audio := Tts.GenerateWithConfig('Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "el"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for different speakers is listed below:
Speaker 0
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 1
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 2
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 3
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 4
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 5
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 6
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 7
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 8
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Speaker 9
0
Γεια σου κόσμε.
1
Πώς είσαι σήμερα;
2
Ο ουρανός είναι γαλάζιος και ο άνεμος ήρεμος.
3
Η μηχανική μάθηση βοηθά τους υπολογιστές να μαθαίνουν από δεδομένα.
4
Η σύνθεση ομιλίας μετατρέπει το κείμενο σε καθαρό ήχο.
5
Οι μαθητές διάβασαν μια μικρή ιστορία στη βιβλιοθήκη.
6
Το τρένο καθυστέρησε λόγω εργασιών συντήρησης.
7
Τα μικρά μοντέλα λειτουργούν γρήγορα σε τοπικές συσκευές.
8
Ο φωνητικός βοηθός διευκολύνει τις καθημερινές εργασίες.
9
Η σταθερή ανάγνωση είναι σημαντική για σύντομα και μεγάλα κείμενα.
Hindi
This section lists text to speech models for Hindi.
vits-piper-hi_IN-pratham-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/pratham/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
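The two export commands above differ only in the variable name: `LD_LIBRARY_PATH` on Linux and `DYLD_LIBRARY_PATH` on macOS. If you launch the compiled binary from a script, the same choice can be made programmatically. This is a sketch in Python with a hypothetical helper `env_with_library_path`; the paths are the ones used in this section.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given system.

    Hypothetical helper; system defaults to platform.system().
    """
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Both variables use ':' as the separator on Linux and macOS.
    env[var] = lib_dir + (":" + old if old else "")
    return env


env = env_with_library_path("/tmp/sherpa-onnx/shared/lib",
                            system="Linux", base_env={})
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting `/tmp/test-piper`.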
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hi_IN-pratham-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
            data_dir="vits-piper-hi_IN-pratham-medium/espeak-ng-data",
            tokens="vits-piper-hi_IN-pratham-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
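The script above relies on the third-party `soundfile` package to save the audio. If that package is not available, float samples can be written as 16-bit PCM with Python's standard `wave` module. This is a sketch under the assumption that the samples are floats in the range [-1, 1]; `write_wave_pcm16` is a hypothetical helper, and the short sine tone stands in for `audio.samples`.

```python
import math
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumes samples are in [-1, 1]; values outside are clipped.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# A 0.1 second 440 Hz tone as placeholder data.
sample_rate = 22050
tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate // 10)]
write_wave_pcm16("tone-test.wav", tone, sample_rate)

with wave.open("tone-test.wav", "rb") as f:
    print(f.getframerate(), f.getnframes())
```

With real output you would pass `audio.samples` and `audio.sample_rate` in place of the placeholder tone.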
C++ API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx".into()),
tokens: Some("vits-piper-hi_IN-pratham-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hi_IN-pratham-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hi_IN-pratham-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx',
tokens: 'vits-piper-hi_IN-pratham-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-pratham-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
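The real-time factor (RTF) printed by the example above is the elapsed time divided by the duration of the generated audio. The following is a minimal Python sketch of that arithmetic only. The helper name real_time_factor and the numbers are made up for illustration and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 2 seconds of 22050 Hz audio generated in 0.5 seconds.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.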
Dart API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx',
tokens: 'vits-piper-hi_IN-pratham-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-pratham-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
lexicon: "",
tokens: "vits-piper-hi_IN-pratham-medium/tokens.txt",
dataDir: "vits-piper-hi_IN-pratham-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
tokens = "vits-piper-hi_IN-pratham-medium/tokens.txt",
dataDir = "vits-piper-hi_IN-pratham-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx");
vits.setTokens("vits-piper-hi_IN-pratham-medium/tokens.txt");
vits.setDataDir("vits-piper-hi_IN-pratham-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hi_IN-pratham-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hi_IN-pratham-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hi_IN-pratham-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hi_IN-pratham-medium/hi_IN-pratham-medium.onnx",
Tokens: "vits-piper-hi_IN-pratham-medium/tokens.txt",
DataDir: "vits-piper-hi_IN-pratham-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।
audio samples for each speaker are listed below:
Speaker 0
vits-piper-hi_IN-priyamvada-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/priyamvada/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
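If you start the program from a script, you can also set the library path for that process only. The following is a minimal Python sketch, assuming the paths used above. library_env is a hypothetical helper and is not part of sherpa-onnx.

```python
import os
import sys


def library_env(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


# Example (not executed here):
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp",
#                  env=library_env("/tmp/sherpa-onnx/shared/lib"), check=True)
```

This leaves the environment of your shell unchanged, which can be convenient when several builds are installed side by side.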
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
            data_dir="vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
            tokens="vits-piper-hi_IN-priyamvada-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
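If soundfile is not available, the samples can also be saved with the Python standard library. The following is a minimal sketch, assuming audio.samples is a sequence of floats in the range [-1, 1]. write_wav_16bit is a hypothetical helper and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2  # number of frames written
```

With the audio object from the example above, the call would be write_wav_16bit("test.wav", audio.samples, audio.sample_rate).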
C++ API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx".into()),
tokens: Some("vits-piper-hi_IN-priyamvada-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hi_IN-priyamvada-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx',
tokens: 'vits-piper-hi_IN-priyamvada-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx',
tokens: 'vits-piper-hi_IN-priyamvada-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
lexicon: "",
tokens: "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
dataDir: "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
tokens = "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
dataDir = "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
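The callbacks in the examples above return 1 to continue and 0 to stop generation. The following Python sketch illustrates that convention without a real model. fake_generate is a stand-in for the engine and make_budget_callback is a hypothetical helper; neither is part of sherpa-onnx.

```python
def make_budget_callback(max_samples):
    """Build a callback that asks generation to stop once max_samples are collected."""
    collected = []

    def callback(samples, progress=0.0):
        collected.extend(samples)
        # 1 means continue, 0 means stop (the convention used above)
        return 1 if len(collected) < max_samples else 0

    return callback, collected


def fake_generate(chunks, callback):
    """Stand-in for a TTS engine: feeds chunks until the callback returns 0."""
    for i, chunk in enumerate(chunks, start=1):
        if callback(chunk, i / len(chunks)) == 0:
            return i  # number of chunks delivered before stopping
    return len(chunks)


if __name__ == "__main__":
    cb, got = make_budget_callback(5)
    delivered = fake_generate([[0.1] * 3, [0.2] * 3, [0.3] * 3], cb)
    print(delivered, len(got))
```

Returning 0 is useful, for example, when the user cancels playback before the whole text has been synthesized.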
Java API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx");
vits.setTokens("vits-piper-hi_IN-priyamvada-medium/tokens.txt");
vits.setDataDir("vits-piper-hi_IN-priyamvada-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hi_IN-priyamvada-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hi_IN-priyamvada-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hi_IN-priyamvada-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hi_IN-priyamvada-medium/hi_IN-priyamvada-medium.onnx",
Tokens: "vits-piper-hi_IN-priyamvada-medium/tokens.txt",
DataDir: "vits-piper-hi_IN-priyamvada-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।
audio samples for each speaker are listed below:
Speaker 0
vits-piper-hi_IN-rohan-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hi/hi_IN/rohan/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
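To check that a generated file matches the sample rate listed above, you can inspect its header. The following is a minimal sketch using the wave module from the Python standard library, assuming the file is an uncompressed PCM WAV. describe_wav is a hypothetical helper and is not part of sherpa-onnx.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return f.getnchannels(), rate, f.getnframes() / rate
```

For a file generated with this model, the second value returned by describe_wav("test.wav") should be 22050.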
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hi_IN-rohan-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
data_dir="vits-piper-hi_IN-rohan-medium/espeak-ng-data",
tokens="vits-piper-hi_IN-rohan-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
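When config.validate() fails in the example above, a likely cause is a model path that does not exist relative to the current working directory. The following is a minimal sketch of a pre-flight check, assuming the model was extracted into the directory name used above. The helper name find_missing is hypothetical and is not part of sherpa-onnx.

```python
import os


def find_missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Directory and file names are taken from the example above.
    model_dir = "vits-piper-hi_IN-rohan-medium"
    required = [
        os.path.join(model_dir, "hi_IN-rohan-medium.onnx"),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    missing = find_missing(required)
    if missing:
        print("Missing files or directories:", missing)
    else:
        print("All model files found")
```

Running this from the directory that contains the extracted model before constructing OfflineTtsConfig gives a more specific message than the generic "Please check your config" error.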
C++ API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
config.model.vits.tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx".into()),
tokens: Some("vits-piper-hi_IN-rohan-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hi_IN-rohan-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hi_IN-rohan-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx',
tokens: 'vits-piper-hi_IN-rohan-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-rohan-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx',
tokens: 'vits-piper-hi_IN-rohan-medium/tokens.txt',
dataDir: 'vits-piper-hi_IN-rohan-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
lexicon: "",
tokens: "vits-piper-hi_IN-rohan-medium/tokens.txt",
dataDir: "vits-piper-hi_IN-rohan-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
tokens = "vits-piper-hi_IN-rohan-medium/tokens.txt",
dataDir = "vits-piper-hi_IN-rohan-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx");
vits.setTokens("vits-piper-hi_IN-rohan-medium/tokens.txt");
vits.setDataDir("vits-piper-hi_IN-rohan-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hi_IN-rohan-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hi_IN-rohan-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hi_IN-rohan-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hi_IN-rohan-medium/hi_IN-rohan-medium.onnx",
Tokens: "vits-piper-hi_IN-rohan-medium/tokens.txt",
DataDir: "vits-piper-hi_IN-rohan-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-hi
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Hindi (hi).
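The examples on this page select the language through the generation config's extra field; the C and C# examples pass it as the JSON string {"lang": "hi"}. The following is a minimal sketch of building that string while checking the code against the 31 languages listed above. The names SUPPORTED_LANGS and make_extra are hypothetical helpers and are not part of sherpa-onnx.

```python
import json

# The 31 language codes listed on this page for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return a JSON string such as {"lang": "hi"} for a supported code."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(make_extra("hi"))
```

Rejecting an unknown code up front avoids passing a language to the model that is not in the list above.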
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hi"
audio = tts.generate("यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
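The Node.js examples on this page report a real-time factor (RTF), the elapsed time divided by the duration of the generated audio. The following is a minimal sketch of the same measurement in Python. The helper names real_time_factor and timed are hypothetical and are not part of sherpa-onnx, and the numbers in the demo are stand-ins rather than measured results.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in numbers: 2 s of audio at 24000 Hz produced in 0.5 s.
    print(f"RTF = {real_time_factor(0.5, 48000, 24000):.3f}")
```

With the example above, this could be used as audio, elapsed = timed(tts.generate, text, gen_config), followed by real_time_factor(elapsed, len(audio.samples), audio.sample_rate), where text is the input string.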
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"hi\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "hi"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "hi"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'hi'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'hi'},
);
final audio = tts.generateWithConfig(text: 'यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "hi"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hi\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "hi"),
)
val audio = tts.generateWithConfigAndCallback(
text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"hi\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "hi"}';
Audio := Tts.GenerateWithConfig('यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "यह अगली पीढ़ी के काल्डी का उपयोग करने वाला एक टेक्स्ट-टू-स्पीच इंजन है"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "hi"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
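Each example for this model passes the language through the generation config's extra field as a JSON string, {"lang": "hi"}. When the language code comes from a variable, building that string with a JSON encoder avoids hand-escaped quotes. The following is a minimal Python sketch using only the standard library. The helper name make_extra is hypothetical and not part of sherpa-onnx, and "lang" is the only key taken from the examples above.

```python
import json


def make_extra(lang):
    """Return the JSON string for the extra field (hypothetical helper)."""
    # json.dumps handles quoting, so no manual escaping is needed.
    return json.dumps({"lang": lang}, ensure_ascii=False)


if __name__ == "__main__":
    print(make_extra("hi"))
```

The returned string can then be assigned to the extra field in whichever binding you use, assuming that binding accepts a JSON string as the C#, Java, Pascal, and Go examples above do.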
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 1
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 2
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 3
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 4
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 5
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 6
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 7
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 8
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Speaker 9
0
नमस्ते दुनिया.
1
आज आप कैसे हैं?
2
आसमान नीला है और हवा हल्की है.
3
मशीन लर्निंग कंप्यूटरों को डेटा से सीखने में मदद करती है.
4
वाक् संश्लेषण पाठ को स्पष्ट ध्वनि में बदलता है.
5
छात्रों ने पुस्तकालय में एक छोटी कहानी पढ़ी.
6
पटरियों की मरम्मत के कारण ट्रेन थोड़ी देर से आई.
7
छोटे मॉडल स्थानीय उपकरणों पर तेज़ी से चलते हैं.
8
वॉयस असिस्टेंट रोज़मर्रा के कामों में मदद करता है.
9
लंबे और छोटे वाक्यों के लिए स्थिर पढ़ना महत्वपूर्ण है.
Hungarian
This section lists text to speech models for Hungarian.
vits-piper-hu_HU-anna-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/anna/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hu_HU-anna-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hu_HU-anna-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
            data_dir="vits-piper-hu_HU-anna-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-anna-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
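Before constructing the config, it can help to confirm that the model files are where the paths expect them. The following is a minimal sketch using only the standard library. The helper name missing_model_files is hypothetical and not part of sherpa-onnx; the default file names are the three referenced in the example above.

```python
from pathlib import Path


def missing_model_files(
    model_dir,
    names=("hu_HU-anna-medium.onnx", "tokens.txt", "espeak-ng-data"),
):
    """Return the entries in names that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_model_files("vits-piper-hu_HU-anna-medium")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

An empty list means every expected path exists; it does not verify that the files are valid models.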
C++ API
You can use the following code to play with vits-piper-hu_HU-anna-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx".into()),
tokens: Some("vits-piper-hu_HU-anna-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hu_HU-anna-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hu_HU-anna-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx',
tokens: 'vits-piper-hu_HU-anna-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-anna-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
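The script above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A value below 1 means synthesis ran faster than playback. The same arithmetic is shown below as a small Python sketch; the function name real_time_factor and the numbers in the usage lines are illustrative, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Elapsed synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 44100 samples at 22050 Hz, synthesized in 0.5 s.
    print(real_time_factor(0.5, 44100, 22050))
```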
Dart API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx',
tokens: 'vits-piper-hu_HU-anna-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-anna-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
lexicon: "",
tokens: "vits-piper-hu_HU-anna-medium/tokens.txt",
dataDir: "vits-piper-hu_HU-anna-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hu_HU-anna-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-anna-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-anna-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
tokens = "vits-piper-hu_HU-anna-medium/tokens.txt",
dataDir = "vits-piper-hu_HU-anna-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
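The callback in the example above returns 1 to let generation continue and 0 to stop it. The following Python sketch illustrates that contract without calling sherpa-onnx. The names make_limit_callback and feed are hypothetical, and feed is only a stand-in for an engine that delivers audio chunk by chunk.

```python
def make_limit_callback(max_samples):
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1: continue, 0: stop

    return callback


def feed(chunks, callback):
    """Stand-in for the engine: deliver chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(5)]
    print(feed(chunks, make_limit_callback(250)))
```

Keeping the running count inside a closure means each synthesis call can get its own independent limit.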
Java API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx");
vits.setTokens("vits-piper-hu_HU-anna-medium/tokens.txt");
vits.setDataDir("vits-piper-hu_HU-anna-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hu_HU-anna-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hu_HU-anna-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hu_HU-anna-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hu_HU-anna-medium/hu_HU-anna-medium.onnx",
Tokens: "vits-piper-hu_HU-anna-medium/tokens.txt",
DataDir: "vits-piper-hu_HU-anna-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ha északról fúj a szél, a lányok nem lógnak együtt."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ha északról fúj a szél, a lányok nem lógnak együtt.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-hu_HU-berta-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/berta/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hu_HU-berta-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hu_HU-berta-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
            data_dir="vits-piper-hu_HU-berta-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-berta-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-hu_HU-berta-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx".into()),
tokens: Some("vits-piper-hu_HU-berta-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hu_HU-berta-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
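The progress callback used above follows a simple protocol: it is called with the newly generated samples and a progress value, and its return value decides whether generation continues (the C++ example on this page uses 1 to continue and 0 to stop). The following Python sketch only simulates that protocol with a stand-in generator. generate_in_chunks and stop_after_half are hypothetical names and are not part of sherpa-onnx.

```python
def generate_in_chunks(chunks, callback=None):
    """Simulate a generator that reports progress after each chunk.

    The callback receives (samples, progress) and returns True to continue
    or False to stop early. This is a stand-in, not sherpa-onnx code.
    """
    produced = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        if callback is not None and not callback(chunk, i / total):
            break  # the caller asked to stop
    return produced


def stop_after_half(samples, progress):
    """Example callback: print progress and stop once half is done."""
    print(f"Progress: {progress * 100:.1f}%")
    return progress < 0.5


audio = generate_in_chunks([[0.1, 0.2], [0.3], [0.4], [0.5]], stop_after_half)
print(len(audio), "samples kept")
```

Samples produced before the callback asks to stop are kept, so a caller can still save or play the partial result.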
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hu_HU-berta-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx',
tokens: 'vits-piper-hu_HU-berta-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-berta-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
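The real-time factor (RTF) printed by the example above is the elapsed processing time divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. A minimal Python sketch of the same arithmetic follows. The function name and the sample numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (both in seconds)."""
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 seconds of audio; assume it took 0.5 s to generate.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```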
Dart API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx',
tokens: 'vits-piper-hu_HU-berta-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-berta-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
lexicon: "",
tokens: "vits-piper-hu_HU-berta-medium/tokens.txt",
dataDir: "vits-piper-hu_HU-berta-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hu_HU-berta-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-berta-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-berta-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
tokens = "vits-piper-hu_HU-berta-medium/tokens.txt",
dataDir = "vits-piper-hu_HU-berta-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx");
vits.setTokens("vits-piper-hu_HU-berta-medium/tokens.txt");
vits.setDataDir("vits-piper-hu_HU-berta-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hu_HU-berta-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hu_HU-berta-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hu_HU-berta-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hu_HU-berta-medium/hu_HU-berta-medium.onnx",
Tokens: "vits-piper-hu_HU-berta-medium/tokens.txt",
DataDir: "vits-piper-hu_HU-berta-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ha északról fúj a szél, a lányok nem lógnak együtt."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ha északról fúj a szél, a lányok nem lógnak együtt.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-hu_HU-imre-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/hu/hu_HU/imre/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-hu_HU-imre-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-hu_HU-imre-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
            data_dir="vits-piper-hu_HU-imre-medium/espeak-ng-data",
            tokens="vits-piper-hu_HU-imre-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ha északról fúj a szél, a lányok nem lógnak együtt.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
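The piper examples on this page all use the same directory layout: a directory named vits-piper-<voice> that contains <voice>.onnx, tokens.txt, and espeak-ng-data. The sketch below builds the three paths from the voice name. It assumes that this naming pattern, which is only inferred from the examples shown here, holds for the voice you downloaded. piper_vits_paths is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import PurePosixPath


def piper_vits_paths(voice, root="."):
    """Build model, tokens, and data_dir paths for a piper VITS voice.

    Assumes the layout vits-piper-<voice>/<voice>.onnx seen in the examples.
    """
    model_dir = PurePosixPath(root) / f"vits-piper-{voice}"
    return {
        "model": str(model_dir / f"{voice}.onnx"),
        "tokens": str(model_dir / "tokens.txt"),
        "data_dir": str(model_dir / "espeak-ng-data"),
    }


paths = piper_vits_paths("hu_HU-imre-medium")
for key, value in paths.items():
    print(f"{key}: {value}")
```

The resulting dictionary can be unpacked into the keyword arguments of OfflineTtsVitsModelConfig shown above, for example with **paths.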
C++ API
You can use the following code to play with vits-piper-hu_HU-imre-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
config.model.vits.tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx".into()),
tokens: Some("vits-piper-hu_HU-imre-medium/tokens.txt".into()),
data_dir: Some("vits-piper-hu_HU-imre-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-hu_HU-imre-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx',
tokens: 'vits-piper-hu_HU-imre-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-imre-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ha északról fúj a szél, a lányok nem lógnak együtt.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx',
tokens: 'vits-piper-hu_HU-imre-medium/tokens.txt',
dataDir: 'vits-piper-hu_HU-imre-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ha északról fúj a szél, a lányok nem lógnak együtt.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
lexicon: "",
tokens: "vits-piper-hu_HU-imre-medium/tokens.txt",
dataDir: "vits-piper-hu_HU-imre-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ha északról fúj a szél, a lányok nem lógnak együtt."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-hu_HU-imre-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-hu_HU-imre-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-hu_HU-imre-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
tokens = "vits-piper-hu_HU-imre-medium/tokens.txt",
dataDir = "vits-piper-hu_HU-imre-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ha északról fúj a szél, a lányok nem lógnak együtt.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx");
vits.setTokens("vits-piper-hu_HU-imre-medium/tokens.txt");
vits.setDataDir("vits-piper-hu_HU-imre-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ha északról fúj a szél, a lányok nem lógnak együtt.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-hu_HU-imre-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-hu_HU-imre-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ha északról fúj a szél, a lányok nem lógnak együtt.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-hu_HU-imre-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-hu_HU-imre-medium/hu_HU-imre-medium.onnx",
Tokens: "vits-piper-hu_HU-imre-medium/tokens.txt",
DataDir: "vits-piper-hu_HU-imre-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ha északról fúj a szél, a lányok nem lógnak együtt."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ha északról fúj a szél, a lányok nem lógnak együtt.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-hu
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Hungarian (hu).
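Since the language is selected with a short code (the Python example in this section sets it to "hu"), it can help to validate the code against the list above before generating. The following is a minimal sketch. SUPPORTED_LANGS is copied from the list above, and check_lang is a hypothetical helper, not part of sherpa-onnx.

```python
# Language codes copied from the list of 31 supported languages above.
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)


def check_lang(lang):
    """Return the normalized language code or raise ValueError."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return code


print(len(SUPPORTED_LANGS), "languages;", "hu ->", check_lang("hu"))
```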
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
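With 10 speakers, valid speaker IDs run from 0 to 9. A small range check such as the following sketch can catch an invalid sid before it is passed to the generator. check_sid is a hypothetical helper, and the default of 10 is taken from the table above.

```python
def check_sid(sid, num_speakers=10):
    """Return sid if it is an integer in [0, num_speakers), else raise."""
    if isinstance(sid, bool) or not isinstance(sid, int):
        raise TypeError(f"sid must be an int, got {type(sid).__name__}")
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


print([check_sid(s) for s in range(10)])
```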
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "hu"
audio = tts.generate("Ez egy szövegfelolvasó motor a következő generációs kaldi használatával", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
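The example above repeats the model directory in every path. A minimal sketch of one way to build those paths from a single directory and to check that the files exist before creating the engine is shown below. The helper names `supertonic_paths` and `missing_files` are hypothetical; the file names are taken from the example above.

```python
from pathlib import Path

# Keyword names and file names as used in the example above.
SUPERTONIC_FILES = {
    "duration_predictor": "duration_predictor.int8.onnx",
    "text_encoder": "text_encoder.int8.onnx",
    "vector_estimator": "vector_estimator.int8.onnx",
    "vocoder": "vocoder.int8.onnx",
    "tts_json": "tts.json",
    "unicode_indexer": "unicode_indexer.bin",
    "voice_style": "voice.bin",
}


def supertonic_paths(model_dir: str) -> dict:
    """Map each config keyword to its path inside model_dir."""
    root = Path(model_dir)
    return {key: str(root / name) for key, name in SUPERTONIC_FILES.items()}


def missing_files(model_dir: str) -> list:
    """Return the expected paths that do not exist on disk."""
    paths = supertonic_paths(model_dir).values()
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    model_dir = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"
    for path in missing_files(model_dir):
        print("missing:", path)
```

Because the dictionary keys match the keyword arguments used above, the result could presumably be passed as `sherpa_onnx.OfflineTtsSupertonicModelConfig(**supertonic_paths(model_dir))`.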
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"hu\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
# The source file comes before the libraries so that the linker can resolve its symbols
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "hu"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
# The source file comes before the libraries so that the linker can resolve its symbols
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "hu"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ez egy szövegfelolvasó motor a következő generációs kaldi használatával';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'hu'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'hu'},
);
final audio = tts.generateWithConfig(text: 'Ez egy szövegfelolvasó motor a következő generációs kaldi használatával', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "hu"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using System;
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"hu\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "hu"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"hu\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "hu"}';
Audio := Tts.GenerateWithConfig('Ez egy szövegfelolvasó motor a következő generációs kaldi használatával', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ez egy szövegfelolvasó motor a következő generációs kaldi használatával"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "hu"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 1
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 2
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 3
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 4
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 5
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 6
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 7
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 8
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Speaker 9
0
Helló világ.
1
Hogy vagy ma?
2
Az ég kék, a szél pedig enyhe.
3
A gépi tanulás segít a számítógépeknek adatokból tanulni.
4
A beszédszintézis a szöveget tiszta hanggá alakítja.
5
A diákok rövid történetet olvastak a könyvtárban.
6
A vonat a pálya karbantartása miatt késett.
7
A kis modellek gyorsan futnak helyi eszközökön.
8
A hangasszisztens segít a mindennapi feladatokban.
9
A stabil felolvasás fontos rövid és hosszú mondatoknál is.
Icelandic
This section lists text to speech models for Icelandic.
- vits-piper-is_IS-bui-medium
- vits-piper-is_IS-salka-medium
- vits-piper-is_IS-steinn-medium
- vits-piper-is_IS-ugla-medium
vits-piper-is_IS-bui-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/bui/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
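To confirm that a generated file has the sample rate listed in the table, its header can be inspected with Python's standard `wave` module. The sketch below assumes an uncompressed PCM WAV file. The helper names `write_silence` and `wav_info` are illustrative and are not part of sherpa-onnx.

```python
import wave


def write_silence(filename: str, sample_rate: int, seconds: float) -> None:
    """Write 16-bit mono silence, used here only as a stand-in for real output."""
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * int(sample_rate * seconds))


def wav_info(filename: str) -> tuple:
    """Return (sample_rate, num_channels, duration_in_seconds) of a WAV file."""
    with wave.open(filename, "rb") as f:
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
        return sample_rate, f.getnchannels(), duration


if __name__ == "__main__":
    write_silence("silence.wav", 22050, 1)
    print(wav_info("silence.wav"))
```

For a file produced by this model, `wav_info("test.wav")` would be expected to report a sample rate of 22050.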
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-is_IS-bui-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Farðu með allt, eða farðu ekki.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
# The source file comes before the libraries so that the linker can resolve its symbols
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-is_IS-bui-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
data_dir="vits-piper-is_IS-bui-medium/espeak-ng-data",
tokens="vits-piper-is_IS-bui-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
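The Node.js example for supertonic-3 earlier in this document reports a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. A minimal Python sketch of the same calculation follows. The helpers `real_time_factor` and `timed` are hypothetical names, and the numbers in the demo are made up for illustration.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s spent generating 2 s of 22050 Hz audio.
    print(real_time_factor(0.5, 44100, 22050))
```

With the example above, one could presumably write `audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)` and then call `real_time_factor(elapsed, len(audio.samples), audio.sample_rate)`.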
C++ API
You can use the following code to play with vits-piper-is_IS-bui-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Farðu með allt, eða farðu ekki.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
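If the program is launched from a script, the same environment setup can be done programmatically. The sketch below is one possible approach in Python; `library_env` is a hypothetical helper, not part of sherpa-onnx, and it assumes the install prefix used above.

```python
from typing import Dict


def library_env(lib_dir: str, system: str, env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with lib_dir prepended to the loader search path.

    system is the value of platform.system(): "Darwin" selects
    DYLD_LIBRARY_PATH, anything else selects LD_LIBRARY_PATH.
    """
    new_env = dict(env)
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = new_env.get(var, "")
    new_env[var] = lib_dir + (":" + old if old else "")
    return new_env


lib_dir = "/tmp/sherpa-onnx/shared/lib"
linux_env = library_env(lib_dir, "Linux", {"LD_LIBRARY_PATH": "/usr/local/lib"})
macos_env = library_env(lib_dir, "Darwin", {})
print(linux_env["LD_LIBRARY_PATH"])
print(macos_env["DYLD_LIBRARY_PATH"])
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run`, with `platform.system()` and `os.environ` as the real inputs.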
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-is_IS-bui-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx".into()),
tokens: Some("vits-piper-is_IS-bui-medium/tokens.txt".into()),
data_dir: Some("vits-piper-is_IS-bui-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Farðu með allt, eða farðu ekki.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-is_IS-bui-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx',
tokens: 'vits-piper-is_IS-bui-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-bui-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Farðu með allt, eða farðu ekki.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
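The real-time factor (RTF) printed by the example is the elapsed time divided by the audio duration, so values below 1 mean synthesis ran faster than playback. Below is a small Python sketch of the same arithmetic; the helper name and the timing numbers are made up for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Compute RTF = elapsed time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Hypothetical numbers: 44100 samples at 22050 Hz is 2 s of audio,
# generated in 0.5 s of wall-clock time.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```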
Dart API
You can use the following code to play with vits-piper-is_IS-bui-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx',
tokens: 'vits-piper-is_IS-bui-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-bui-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-is_IS-bui-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
lexicon: "",
tokens: "vits-piper-is_IS-bui-medium/tokens.txt",
dataDir: "vits-piper-is_IS-bui-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Farðu með allt, eða farðu ekki."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-is_IS-bui-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-bui-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-bui-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-is_IS-bui-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
tokens = "vits-piper-is_IS-bui-medium/tokens.txt",
dataDir = "vits-piper-is_IS-bui-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Farðu með allt, eða farðu ekki.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-is_IS-bui-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx");
vits.setTokens("vits-piper-is_IS-bui-medium/tokens.txt");
vits.setDataDir("vits-piper-is_IS-bui-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-is_IS-bui-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-is_IS-bui-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-is_IS-bui-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-is_IS-bui-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-is_IS-bui-medium/is_IS-bui-medium.onnx",
Tokens: "vits-piper-is_IS-bui-medium/tokens.txt",
DataDir: "vits-piper-is_IS-bui-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Farðu með allt, eða farðu ekki."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Farðu með allt, eða farðu ekki.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-is_IS-salka-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/salka/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
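To confirm that a generated file matches the sample rate in the table, the WAV header can be inspected with Python's standard `wave` module. This is a sketch that assumes the file is uncompressed PCM; `wav_info` is a hypothetical helper, and the demo writes its own short silent file rather than using real model output.

```python
import os
import tempfile
import wave
from typing import Tuple


def wav_info(path: str) -> Tuple[int, int, int]:
    """Return (sample_rate, num_channels, num_frames) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        return f.getframerate(), f.getnchannels(), f.getnframes()


# Demo only: write 0.1 s of mono 16-bit silence at 22050 Hz.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(22050)
    f.writeframes(b"\x00\x00" * 2205)

rate, channels, frames = wav_info(demo_path)
print(f"sample rate: {rate}, channels: {channels}, duration: {frames / rate:.2f} s")
```

For a real check, `demo_path` would be replaced by the file produced by one of the API examples.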
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-is_IS-salka-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Farðu með allt, eða farðu ekki.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-is_IS-salka-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
data_dir="vits-piper-is_IS-salka-medium/espeak-ng-data",
tokens="vits-piper-is_IS-salka-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
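A common failure is pointing the config at files that were not extracted where expected. As a hedged sketch, the paths can be checked before creating the config; `missing_model_files` is a hypothetical helper, and the demo uses a temporary directory rather than the real model.

```python
import os
import tempfile
from typing import List


def missing_model_files(model_dir: str, model_name: str) -> List[str]:
    """Return the required paths that do not exist under model_dir."""
    required = [
        os.path.join(model_dir, model_name),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    return [p for p in required if not os.path.exists(p)]


# Demo only: a directory that contains tokens.txt and nothing else.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "tokens.txt"), "w").close()

missing = missing_model_files(demo_dir, "is_IS-salka-medium.onnx")
for path in missing:
    print("missing:", os.path.basename(path))
```

An empty list means all three paths used in the config above are present.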
C++ API
You can use the following code to play with vits-piper-is_IS-salka-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Farðu með allt, eða farðu ekki.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-is_IS-salka-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx".into()),
tokens: Some("vits-piper-is_IS-salka-medium/tokens.txt".into()),
data_dir: Some("vits-piper-is_IS-salka-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Farðu með allt, eða farðu ekki.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-is_IS-salka-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx',
tokens: 'vits-piper-is_IS-salka-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-salka-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Farðu með allt, eða farðu ekki.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-is_IS-salka-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx',
tokens: 'vits-piper-is_IS-salka-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-salka-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-is_IS-salka-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
lexicon: "",
tokens: "vits-piper-is_IS-salka-medium/tokens.txt",
dataDir: "vits-piper-is_IS-salka-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Farðu með allt, eða farðu ekki."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-is_IS-salka-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-salka-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-salka-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-is_IS-salka-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
tokens = "vits-piper-is_IS-salka-medium/tokens.txt",
dataDir = "vits-piper-is_IS-salka-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Farðu með allt, eða farðu ekki.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-is_IS-salka-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx");
vits.setTokens("vits-piper-is_IS-salka-medium/tokens.txt");
vits.setDataDir("vits-piper-is_IS-salka-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-is_IS-salka-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-is_IS-salka-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-is_IS-salka-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-is_IS-salka-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-is_IS-salka-medium/is_IS-salka-medium.onnx",
Tokens: "vits-piper-is_IS-salka-medium/tokens.txt",
DataDir: "vits-piper-is_IS-salka-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Farðu með allt, eða farðu ekki."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Farðu með allt, eða farðu ekki.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-is_IS-steinn-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/steinn/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-is_IS-steinn-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Farðu með allt, eða farðu ekki.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-is_IS-steinn-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
data_dir="vits-piper-is_IS-steinn-medium/espeak-ng-data",
tokens="vits-piper-is_IS-steinn-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
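The example above depends on the soundfile package to save the result. Where that dependency is not available, the samples can also be written with the Python standard library. The following is a minimal sketch, assuming the generated samples are mono floats in the range [-1.0, 1.0]; write_wav_int16 is a hypothetical helper name and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file.

    Assumption: samples are floats in [-1.0, 1.0]; values outside
    that range are clipped before conversion.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        # Scale to signed 16-bit little-endian PCM.
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)
```

With the objects from the example above, a call would look like write_wav_int16("test.wav", audio.samples, audio.sample_rate).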
C++ API
You can use the following code to play with vits-piper-is_IS-steinn-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Farðu með allt, eða farðu ekki.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx".into()),
tokens: Some("vits-piper-is_IS-steinn-medium/tokens.txt".into()),
data_dir: Some("vits-piper-is_IS-steinn-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Farðu með allt, eða farðu ekki.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-is_IS-steinn-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx',
tokens: 'vits-piper-is_IS-steinn-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-steinn-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Farðu með allt, eða farðu ekki.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
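The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. A value below 1 means synthesis ran faster than playback. The same arithmetic is sketched below in Python; real_time_factor is a hypothetical helper name, not a sherpa-onnx function.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the audio in seconds.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For instance, 44100 samples at 22050 Hz are 2 seconds of audio, so generating them in 0.5 seconds gives an RTF of 0.25.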
Dart API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx',
tokens: 'vits-piper-is_IS-steinn-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-steinn-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
lexicon: "",
tokens: "vits-piper-is_IS-steinn-medium/tokens.txt",
dataDir: "vits-piper-is_IS-steinn-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Farðu með allt, eða farðu ekki."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-is_IS-steinn-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-steinn-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-steinn-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
tokens = "vits-piper-is_IS-steinn-medium/tokens.txt",
dataDir = "vits-piper-is_IS-steinn-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Farðu með allt, eða farðu ekki.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
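In the callback examples in this document, the return value controls generation: 1 means continue and 0 means stop. As an illustration of that protocol, the sketch below builds a callback that asks generation to stop once a sample budget is reached. It is written in Python and exercised with plain lists; make_limited_callback is a hypothetical helper, and the optional progress argument is an assumption made so that the same function fits callbacks that do or do not receive a progress value.

```python
def make_limited_callback(max_samples):
    """Build a callback that returns 0 (stop) once max_samples are received."""
    state = {"received": 0}

    def callback(samples, progress=0.0):
        # Count the samples delivered in this chunk.
        state["received"] += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if state["received"] < max_samples else 0

    return callback, state
```

The returned state dictionary lets the caller see how many samples arrived before the stop was requested.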
Java API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx");
vits.setTokens("vits-piper-is_IS-steinn-medium/tokens.txt");
vits.setDataDir("vits-piper-is_IS-steinn-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-is_IS-steinn-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-is_IS-steinn-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-is_IS-steinn-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-is_IS-steinn-medium/is_IS-steinn-medium.onnx",
Tokens: "vits-piper-is_IS-steinn-medium/tokens.txt",
DataDir: "vits-piper-is_IS-steinn-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Farðu með allt, eða farðu ekki."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Farðu með allt, eða farðu ekki.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-is_IS-ugla-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/is/is_IS/ugla/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-is_IS-ugla-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Farðu með allt, eða farðu ekki.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-is_IS-ugla-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
data_dir="vits-piper-is_IS-ugla-medium/espeak-ng-data",
tokens="vits-piper-is_IS-ugla-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Farðu með allt, eða farðu ekki.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
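The paths in the example above are relative, so they resolve only when the script runs from the directory that contains the extracted model. A small pre-flight check can report what is missing before the config is built. The sketch below uses only the standard library; missing_model_files is a hypothetical helper name, not part of sherpa-onnx, and the three required entries are taken from the paths used in the example.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the required model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]
```

An empty list from missing_model_files("vits-piper-is_IS-ugla-medium", "is_IS-ugla-medium.onnx") means all three paths were found.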
C++ API
You can use the following code to play with vits-piper-is_IS-ugla-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
config.model.vits.tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Farðu með allt, eða farðu ekki.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx".into()),
tokens: Some("vits-piper-is_IS-ugla-medium/tokens.txt".into()),
data_dir: Some("vits-piper-is_IS-ugla-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Farðu með allt, eða farðu ekki.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-is_IS-ugla-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx',
tokens: 'vits-piper-is_IS-ugla-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-ugla-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Farðu með allt, eða farðu ekki.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx',
tokens: 'vits-piper-is_IS-ugla-medium/tokens.txt',
dataDir: 'vits-piper-is_IS-ugla-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Farðu með allt, eða farðu ekki.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
lexicon: "",
tokens: "vits-piper-is_IS-ugla-medium/tokens.txt",
dataDir: "vits-piper-is_IS-ugla-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Farðu með allt, eða farðu ekki."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-is_IS-ugla-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-is_IS-ugla-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-is_IS-ugla-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
tokens = "vits-piper-is_IS-ugla-medium/tokens.txt",
dataDir = "vits-piper-is_IS-ugla-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Farðu með allt, eða farðu ekki.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx");
vits.setTokens("vits-piper-is_IS-ugla-medium/tokens.txt");
vits.setDataDir("vits-piper-is_IS-ugla-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Farðu með allt, eða farðu ekki.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-is_IS-ugla-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-is_IS-ugla-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Farðu með allt, eða farðu ekki.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-is_IS-ugla-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-is_IS-ugla-medium/is_IS-ugla-medium.onnx",
Tokens: "vits-piper-is_IS-ugla-medium/tokens.txt",
DataDir: "vits-piper-is_IS-ugla-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Farðu með allt, eða farðu ekki."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Farðu með allt, eða farðu ekki.
sample audios for different speakers are listed below:
Speaker 0
Indonesian
This section lists text to speech models for Indonesian.
vits-piper-id_ID-news_tts-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/id/id_ID/news_tts/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
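The examples for the piper-based VITS models in this document all reference the same three paths, built from the voice name: a directory named vits-piper-VOICE that contains VOICE.onnx, tokens.txt, and espeak-ng-data. The sketch below derives those paths from a voice name such as id_ID-news_tts-medium. It only mirrors the pattern visible in the examples; piper_model_paths is a hypothetical helper, and other models may be laid out differently.

```python
def piper_model_paths(voice):
    """Return the relative paths the examples use for a piper voice."""
    model_dir = f"vits-piper-{voice}"
    return {
        "model": f"{model_dir}/{voice}.onnx",
        "tokens": f"{model_dir}/tokens.txt",
        "data_dir": f"{model_dir}/espeak-ng-data",
    }
```

The resulting values can be passed to the model, tokens, and data_dir fields shown in the per-language examples.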
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-id_ID-news_tts-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
data_dir="vits-piper-id_ID-news_tts-medium/espeak-ng-data",
tokens="vits-piper-id_ID-news_tts-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
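If the `config.validate()` check in the example above fails, one thing worth ruling out is a missing or misplaced model file. The sketch below is an illustration under stated assumptions, not part of sherpa-onnx: `missing_model_files` is a hypothetical helper, and the file names are simply the three paths used in the example above.

```python
from pathlib import Path

# Entries used by the example above, relative to the model directory.
REQUIRED = ("id_ID-news_tts-medium.onnx", "tokens.txt", "espeak-ng-data")


def missing_model_files(model_dir, required=REQUIRED):
    """Return the entries of `required` that do not exist under `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).exists()]
```

Calling `missing_model_files("vits-piper-id_ID-news_tts-medium")` returns an empty list when all three entries are present in the current working directory's copy of the model.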
C++ API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx".into()),
tokens: Some("vits-piper-id_ID-news_tts-medium/tokens.txt".into()),
data_dir: Some("vits-piper-id_ID-news_tts-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-id_ID-news_tts-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx',
tokens: 'vits-piper-id_ID-news_tts-medium/tokens.txt',
dataDir: 'vits-piper-id_ID-news_tts-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx',
tokens: 'vits-piper-id_ID-news_tts-medium/tokens.txt',
dataDir: 'vits-piper-id_ID-news_tts-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
lexicon: "",
tokens: "vits-piper-id_ID-news_tts-medium/tokens.txt",
dataDir: "vits-piper-id_ID-news_tts-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
tokens = "vits-piper-id_ID-news_tts-medium/tokens.txt",
dataDir = "vits-piper-id_ID-news_tts-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx");
vits.setTokens("vits-piper-id_ID-news_tts-medium/tokens.txt");
vits.setDataDir("vits-piper-id_ID-news_tts-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-id_ID-news_tts-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-id_ID-news_tts-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-id_ID-news_tts-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-id_ID-news_tts-medium/id_ID-news_tts-medium.onnx",
Tokens: "vits-piper-id_ID-news_tts-medium/tokens.txt",
DataDir: "vits-piper-id_ID-news_tts-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-id
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Indonesian (id).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
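The examples below select the language per request through a `lang` entry in the generation config's `extra` field, which the C and C# examples pass as a JSON string. As a minimal sketch, assuming you want to catch an unsupported language code or an out-of-range speaker ID before calling the engine, the hypothetical helpers `make_extra` and `check_sid` below use only the language list and speaker count stated above. They are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
)
NUM_SPEAKERS = 10  # from the table above; valid sid values are 0..9


def make_extra(lang):
    """Return the JSON string form of the `extra` field for a supported language."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid):
    """Return sid unchanged if it is a valid speaker ID, else raise ValueError."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in 0..{NUM_SPEAKERS - 1}, got {sid}")
    return sid
```

For Indonesian, `make_extra("id")` produces the same string that the C example assigns to `gen_cfg.extra`.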
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "id"
audio = tts.generate("Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
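The Node.js examples on this page report a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio. A minimal Python sketch of the same measurement follows. `real_time_factor` and `measure_rtf` are hypothetical helpers, and the only assumption is that the callable returns an object with `samples` and `sample_rate` attributes, as `tts.generate` does in the example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def measure_rtf(generate, *args):
    """Call generate(*args), time it, and return (audio, rtf)."""
    start = time.perf_counter()
    audio = generate(*args)
    elapsed = time.perf_counter() - start
    return audio, real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

With the objects from the example above, this could be used as `audio, rtf = measure_rtf(tts.generate, text, gen_config)`, where `text` holds the input sentence.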
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"id\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "id"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "id"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'id'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'id'},
);
final audio = tts.generateWithConfig(text: 'Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "id"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"id\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "id"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"id\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "id"}';
Audio := Tts.GenerateWithConfig('Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ini adalah mesin text-to-speech yang menggunakan Kaldi generasi berikutnya"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "id"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
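The examples above pass the extra generation option in two shapes: the Dart, Swift, and Kotlin snippets use a map, while the C#, Java, Pascal, and Go snippets use a JSON string. As a small sketch (not part of sherpa-onnx; the helper name make_extra is hypothetical), the string form can be produced from a dictionary with Python's standard json module instead of being escaped by hand:

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the extra options as a JSON string (hypothetical helper)."""
    return json.dumps({"lang": lang})


print(make_extra("id"))  # prints {"lang": "id"}
```

The resulting string matches the literal used in the string-based examples above.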
Samples
Sample audio files for different speakers are listed below:
Speaker 0
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 1
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 2
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 3
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 4
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 5
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 6
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 7
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 8
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Speaker 9
| Index | Text |
|---|---|
| 0 | Halo dunia. |
| 1 | Apa kabar hari ini? |
| 2 | Langit berwarna biru dan angin terasa lembut. |
| 3 | Pembelajaran mesin membantu komputer belajar dari data. |
| 4 | Sintesis ucapan mengubah teks menjadi suara yang jelas. |
| 5 | Para siswa membaca cerita pendek di perpustakaan. |
| 6 | Kereta terlambat karena perawatan rel. |
| 7 | Model kecil berjalan cepat di perangkat lokal. |
| 8 | Asisten suara membantu pekerjaan sehari-hari. |
| 9 | Pembacaan yang stabil penting untuk kalimat pendek dan panjang. |
Italian
This section lists text-to-speech models for Italian.
vits-piper-it_IT-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-dii-high.tar.bz2
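For readers who prefer to script the download, the following is a minimal Python sketch that fetches and unpacks the archive from the address above using only the standard library. The helper names (model_url, extract_archive, download_model) are hypothetical, and the sketch assumes the archive unpacks into a directory named after the model, which is what the paths in the API examples on this page suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release prefix copied from the download address above.
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the archive URL for a model name (pattern taken from the address above)."""
    return f"{RELEASE_BASE}/{model_name}.tar.bz2"


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.bz2 archive into dest."""
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)


def download_model(model_name: str, dest: Path = Path(".")) -> Path:
    """Download, unpack, and delete the archive; return the assumed model directory."""
    archive = dest / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), archive)
    extract_archive(archive, dest)
    archive.unlink()
    # Assumption: the archive contains a top-level directory named after the model.
    return dest / model_name
```

Calling download_model("vits-piper-it_IT-dii-high") from the directory where you run the examples should leave the model files where the code on this page expects them.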
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-it_IT-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
config.model.vits.tokens = "vits-piper-it_IT-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-it_IT-dii-high.tar.bz2
You can use the following code to play with vits-piper-it_IT-dii-high
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
            data_dir="vits-piper-it_IT-dii-high/espeak-ng-data",
            tokens="vits-piper-it_IT-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
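The Node.js example on this page also reports the real-time factor (RTF), that is, the elapsed synthesis time divided by the duration of the generated audio. The same measurement could be sketched in Python roughly as follows. The helpers real_time_factor and timed are hypothetical and not part of sherpa-onnx, and a list of zeros stands in for a real synthesis call so that the snippet runs without a model.

```python
import time
from typing import Callable, Sequence, Tuple


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration; below 1.0 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(generate: Callable[[], Sequence[float]]) -> Tuple[Sequence[float], float]:
    """Run generate() and return its samples together with the elapsed seconds."""
    start = time.perf_counter()
    samples = generate()
    return samples, time.perf_counter() - start


# Stand-in for a real synthesis call: 2 seconds of silence at 22050 Hz.
samples, elapsed = timed(lambda: [0.0] * (2 * 22050))
print(f"RTF = {real_time_factor(elapsed, len(samples), 22050):.3f}")
```

To measure the model itself, one would wrap the tts.generate(...) call from the code above in the timed section and pass len(audio.samples) and audio.sample_rate to real_time_factor.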
C++ API
You can use the following code to play with vits-piper-it_IT-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
config.model.vits.tokens = "vits-piper-it_IT-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-it_IT-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-it_IT-dii-high/it_IT-dii-high.onnx".into()),
tokens: Some("vits-piper-it_IT-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-it_IT-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-it_IT-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx',
tokens: 'vits-piper-it_IT-dii-high/tokens.txt',
dataDir: 'vits-piper-it_IT-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-it_IT-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx',
tokens: 'vits-piper-it_IT-dii-high/tokens.txt',
dataDir: 'vits-piper-it_IT-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-it_IT-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-it_IT-dii-high/tokens.txt",
dataDir: "vits-piper-it_IT-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-it_IT-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-it_IT-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
tokens = "vits-piper-it_IT-dii-high/tokens.txt",
dataDir = "vits-piper-it_IT-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-it_IT-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-it_IT-dii-high/it_IT-dii-high.onnx");
vits.setTokens("vits-piper-it_IT-dii-high/tokens.txt");
vits.setDataDir("vits-piper-it_IT-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-it_IT-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-it_IT-dii-high/it_IT-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-it_IT-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-it_IT-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-it_IT-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-it_IT-dii-high/it_IT-dii-high.onnx",
Tokens: "vits-piper-it_IT-dii-high/tokens.txt",
DataDir: "vits-piper-it_IT-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-it_IT-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-it_IT-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
config.model.vits.tokens = "vits-piper-it_IT-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
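As a rough illustration of the Linux/macOS split described above, the following Python sketch picks the loader variable for the current platform and prepends the library directory to it. The helper name `library_path_env` is hypothetical and not part of sherpa-onnx; only the two variable names and the `/tmp/sherpa-onnx/shared/lib` path come from the text.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, environ=None):
    """Return (variable_name, new_value) for the dynamic loader search path.

    Hypothetical helper: macOS uses DYLD_LIBRARY_PATH, and every other
    platform is treated like Linux (LD_LIBRARY_PATH) for simplicity.
    """
    environ = os.environ if environ is None else environ
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = environ.get(name, "")
    # Prepend lib_dir and keep any existing entries after it.
    value = f"{lib_dir}:{old}" if old else lib_dir
    return name, value


if __name__ == "__main__":
    name, value = library_path_env("/tmp/sherpa-onnx/shared/lib")
    print(f"export {name}={value}")
```

The printed line can be pasted into a shell before running the compiled example.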
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-it_IT-miro-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
data_dir="vits-piper-it_IT-miro-high/espeak-ng-data",
tokens="vits-piper-it_IT-miro-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
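Before building the config, it can help to confirm that the three paths used in the example above actually exist. The following is a minimal sketch using only the standard library; the helper name `missing_model_files` is hypothetical, and the directory layout is assumed to match the paths shown in the example.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths that do not exist.

    The three entries mirror the paths passed to OfflineTtsVitsModelConfig
    in the example above; this helper itself is hypothetical.
    """
    root = Path(model_dir)
    expected = [
        root / model_name,        # the ONNX model file
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # directory with espeak-ng data
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-it_IT-miro-high", "it_IT-miro-high.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

An empty list means all three paths are present relative to the current working directory.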
C++ API
You can use the following code to play with vits-piper-it_IT-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
config.model.vits.tokens = "vits-piper-it_IT-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-it_IT-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-it_IT-miro-high/it_IT-miro-high.onnx".into()),
tokens: Some("vits-piper-it_IT-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-it_IT-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-it_IT-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx',
tokens: 'vits-piper-it_IT-miro-high/tokens.txt',
dataDir: 'vits-piper-it_IT-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
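The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name `real_time_factor` is hypothetical, and the numbers in the demo are illustrative only, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    A value below 1 means the audio was generated faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 5 seconds of audio at 22050 Hz generated in 0.5 s.
    rtf = real_time_factor(0.5, 5 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```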
Dart API
You can use the following code to play with vits-piper-it_IT-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx',
tokens: 'vits-piper-it_IT-miro-high/tokens.txt',
dataDir: 'vits-piper-it_IT-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-it_IT-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-it_IT-miro-high/tokens.txt",
dataDir: "vits-piper-it_IT-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-it_IT-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-it_IT-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
tokens = "vits-piper-it_IT-miro-high/tokens.txt",
dataDir = "vits-piper-it_IT-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
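The callback in the example above always returns 1, so generation never stops early. The sketch below shows, in plain Python, how the same convention (1 to continue, 0 to stop) could be used to stop after a sample budget is reached. The class `StopAfter` is hypothetical and is not wired to sherpa-onnx here; the demo only simulates chunks being delivered.

```python
class StopAfter:
    """Callback that asks the generator to stop once enough samples arrived.

    Follows the convention from the example above: return 1 to continue,
    0 to stop. The class itself is hypothetical.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples):
        self.received += len(samples)
        return 1 if self.received < self.max_samples else 0


if __name__ == "__main__":
    cb = StopAfter(max_samples=22050)  # about one second at 22050 Hz
    # Simulate an engine delivering three chunks of 10000 samples each.
    for chunk in ([0.0] * 10000,) * 3:
        if cb(chunk) == 0:
            break
    print("received", cb.received, "samples before stopping")
```

Keeping the counter on the object avoids global state and lets the caller inspect how much audio arrived before the stop request.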
Java API
You can use the following code to play with vits-piper-it_IT-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-it_IT-miro-high/it_IT-miro-high.onnx");
vits.setTokens("vits-piper-it_IT-miro-high/tokens.txt");
vits.setDataDir("vits-piper-it_IT-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-it_IT-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-it_IT-miro-high/it_IT-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-it_IT-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-it_IT-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-it_IT-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-it_IT-miro-high/it_IT-miro-high.onnx",
Tokens: "vits-piper-it_IT-miro-high/tokens.txt",
DataDir: "vits-piper-it_IT-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.
audio samples for each speaker are listed below:
Speaker 0
vits-piper-it_IT-paola-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/it/it_IT/paola/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-it_IT-paola-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
config.model.vits.tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-it_IT-paola-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
data_dir="vits-piper-it_IT-paola-medium/espeak-ng-data",
tokens="vits-piper-it_IT-paola-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
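If the soundfile package is not available, the generated samples could also be saved with Python's standard `wave` module. The sketch below assumes the samples are mono floats in the range [-1, 1] (an assumption, not something stated above) and converts them to 16-bit PCM. The function name `write_wave_pcm16` is hypothetical.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, a call might look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.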
C++ API
You can use the following code to play with vits-piper-it_IT-paola-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
config.model.vits.tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-it_IT-paola-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx".into()),
tokens: Some("vits-piper-it_IT-paola-medium/tokens.txt".into()),
data_dir: Some("vits-piper-it_IT-paola-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-it_IT-paola-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx',
tokens: 'vits-piper-it_IT-paola-medium/tokens.txt',
dataDir: 'vits-piper-it_IT-paola-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-it_IT-paola-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx',
tokens: 'vits-piper-it_IT-paola-medium/tokens.txt',
dataDir: 'vits-piper-it_IT-paola-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-it_IT-paola-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
lexicon: "",
tokens: "vits-piper-it_IT-paola-medium/tokens.txt",
dataDir: "vits-piper-it_IT-paola-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-it_IT-paola-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-paola-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-paola-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-it_IT-paola-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
tokens = "vits-piper-it_IT-paola-medium/tokens.txt",
dataDir = "vits-piper-it_IT-paola-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-it_IT-paola-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx");
vits.setTokens("vits-piper-it_IT-paola-medium/tokens.txt");
vits.setDataDir("vits-piper-it_IT-paola-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-it_IT-paola-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-it_IT-paola-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-it_IT-paola-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-it_IT-paola-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-it_IT-paola-medium/it_IT-paola-medium.onnx",
Tokens: "vits-piper-it_IT-paola-medium/tokens.txt",
DataDir: "vits-piper-it_IT-paola-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-it_IT-riccardo-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/it/it_IT/riccardo/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
config.model.vits.tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
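# The source file is listed before the libraries so that linkers using --as-needed resolve its symbols
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime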
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-it_IT-riccardo-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
data_dir="vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
tokens="vits-piper-it_IT-riccardo-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
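The example above saves the audio with soundfile. If that package is not available, the samples can also be written with the Python standard library. The following is a minimal sketch, not part of sherpa-onnx: `write_wave_int16` is a hypothetical helper, and it assumes the generated samples are floats in the range [-1, 1].

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Hypothetical helper: save float samples as a mono 16-bit PCM WAV file."""
    # Clip to [-1, 1] (assumed sample range) and scale to 16-bit integers.
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        # WAV data is little-endian, hence the "<" in the format string.
        f.writeframes(struct.pack(f"<{len(ints)}h", *ints))


# Demo with four made-up samples at the sample rate of this model (16000 Hz).
write_wave_int16("demo.wav", [0.0, 0.5, -0.5, 1.0], 16000)
with wave.open("demo.wav", "rb") as f:
    print(f.getframerate(), f.getnframes())
```

With the Python API example, the call would be `write_wave_int16("test.wav", audio.samples, audio.sample_rate)`.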
C++ API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
config.model.vits.tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
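# The source file is listed before the libraries so that linkers using --as-needed resolve its symbols
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime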
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx".into()),
tokens: Some("vits-piper-it_IT-riccardo-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-it_IT-riccardo-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx',
tokens: 'vits-piper-it_IT-riccardo-x_low/tokens.txt',
dataDir: 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
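The Node.js example above reports a real-time factor (RTF), computed as the elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name `real_time_factor` and the numbers are illustrative assumptions, not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; a value below 1 is faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 32000 samples at 16000 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=32000, sample_rate=16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```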
Dart API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx',
tokens: 'vits-piper-it_IT-riccardo-x_low/tokens.txt',
dataDir: 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
lexicon: "",
tokens: "vits-piper-it_IT-riccardo-x_low/tokens.txt",
dataDir: "vits-piper-it_IT-riccardo-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
tokens = "vits-piper-it_IT-riccardo-x_low/tokens.txt",
dataDir = "vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
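The callback in the Kotlin example above always returns 1, so generation never stops early. The sketch below shows, in plain Python and without calling sherpa-onnx, how the same return convention (1 to continue, 0 to stop) could be used to stop once a given amount of audio has been received. `make_limited_callback` is a hypothetical helper, and the chunk sizes are simulated because how often the callback is invoked is not stated here.

```python
def make_limited_callback(max_seconds: float, sample_rate: int):
    """Hypothetical helper: return a callback that asks to stop after max_seconds of audio."""
    state = {"num_samples": 0}

    def callback(samples) -> int:
        state["num_samples"] += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if state["num_samples"] < max_seconds * sample_rate else 0

    return callback


# Simulate three chunks of 8000 samples each at 16000 Hz with a 1-second limit.
cb = make_limited_callback(max_seconds=1.0, sample_rate=16000)
results = [cb([0.0] * 8000) for _ in range(3)]
print(results)  # prints "[1, 0, 0]"
```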
Java API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx");
vits.setTokens("vits-piper-it_IT-riccardo-x_low/tokens.txt");
vits.setDataDir("vits-piper-it_IT-riccardo-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-it_IT-riccardo-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-it_IT-riccardo-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-it_IT-riccardo-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-it_IT-riccardo-x_low/it_IT-riccardo-x_low.onnx",
Tokens: "vits-piper-it_IT-riccardo-x_low/tokens.txt",
DataDir: "vits-piper-it_IT-riccardo-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-it
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Italian (it).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
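Since the model accepts 31 language codes and 10 speaker IDs, it can help to validate both before calling the TTS engine. The following is a minimal sketch under stated assumptions: `make_generation_args` is a hypothetical helper that is not part of sherpa-onnx, and the JSON layout of `extra` simply mirrors the `{"lang": "it"}` string used in the C example on this page.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv nl pl pt "
    "ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # from the table above; valid sid values are 0..9


def make_generation_args(lang: str, sid: int) -> dict:
    """Hypothetical helper: validate lang and sid, then build the `extra` JSON string."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return {"sid": sid, "extra": json.dumps({"lang": lang})}


print(make_generation_args("it", 0))
```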
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "it"
audio = tts.generate("Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"it\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "it"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "it"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'it'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'it'},
);
final audio = tts.generateWithConfig(text: 'Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "it"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"it\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "it"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"it\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "it"}';
Audio := Tts.GenerateWithConfig('Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "it"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 1
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 2
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 3
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 4
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 5
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 6
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 7
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 8
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
Speaker 9
0
Ciao mondo.
1
Come stai oggi?
2
Il cielo è blu e il vento è leggero.
3
L’apprendimento automatico aiuta i computer a imparare dai dati.
4
La sintesi vocale trasforma il testo in audio chiaro.
5
Gli studenti hanno letto una breve storia in biblioteca.
6
Il treno ha subito un ritardo per lavori sui binari.
7
I modelli piccoli funzionano rapidamente sui dispositivi locali.
8
Un assistente vocale aiuta nelle attività quotidiane.
9
Una lettura stabile è importante per frasi brevi e lunghe.
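The listing above pairs each of the ten speakers with the same ten sentences. As a rough sketch of how such a grid could be enumerated before calling a TTS API, the snippet below builds one job per speaker and sentence. The function name `sample_jobs` and the output file naming scheme are hypothetical and do not describe how the published samples were produced.

```python
from itertools import product

NUM_SPEAKERS = 10  # Speaker 0 .. Speaker 9 in the listing above
TEXTS = [
    "Ciao mondo.",
    "Come stai oggi?",
    "Il cielo è blu e il vento è leggero.",
    "L’apprendimento automatico aiuta i computer a imparare dai dati.",
    "La sintesi vocale trasforma il testo in audio chiaro.",
    "Gli studenti hanno letto una breve storia in biblioteca.",
    "Il treno ha subito un ritardo per lavori sui binari.",
    "I modelli piccoli funzionano rapidamente sui dispositivi locali.",
    "Un assistente vocale aiuta nelle attività quotidiane.",
    "Una lettura stabile è importante per frasi brevi e lunghe.",
]


def sample_jobs(num_speakers=NUM_SPEAKERS, texts=TEXTS):
    """Yield (sid, index, text, filename) for every speaker/sentence pair."""
    for sid, (index, text) in product(range(num_speakers), enumerate(texts)):
        # Hypothetical naming scheme for the output files.
        yield sid, index, text, f"it-sid{sid}-{index}.wav"


jobs = list(sample_jobs())
print(len(jobs))  # 10 speakers x 10 sentences
print(jobs[0])
```

Each job's `sid` and `text` would then be passed to the generation call of whichever API you use, with `lang` set to `it`.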
Japanese
This section lists text-to-speech models for Japanese.
supertonic-3-ja
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Japanese (ja).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
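Several of the APIs below pass the language as a JSON string such as `{"lang": "ja"}`, and the speaker ID must be one of the ten values listed above. The following Python sketch validates both before a generation call. The helper names `make_extra` and `check_sid` are hypothetical and are not part of sherpa-onnx; the language list is copied from the model description above.

```python
import json

# Language codes listed above for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10  # from the table above


def make_extra(lang: str) -> str:
    """Return the JSON string used for the `extra` field, e.g. '{"lang": "ja"}'."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID, else raise ValueError."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


print(make_extra("ja"))
print(check_sid(0))
```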
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with the Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ja"
audio = tts.generate("これは次世代のkaldiを使用したテキスト読み上げエンジンです", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
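If you prefer not to depend on `soundfile`, the samples can also be written with the standard-library `wave` module. The sketch below assumes that the samples are mono floats in the range [-1, 1], which is an assumption about the generated audio rather than something stated on this page. The helper name `write_wave_int16` is hypothetical.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed to be in [-1, 1]) as 16-bit PCM."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample (16-bit)
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Usage with the objects from the example above:
# write_wave_int16("test.wav", audio.samples, audio.sample_rate)
```

Converting to 16-bit PCM keeps the file small and widely playable, at the cost of some precision compared with float samples.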
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"ja\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
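The library search path can also be set for a single run from a script instead of exporting it in the shell. The following Python sketch builds such an environment. It assumes the usual loader variables named above (`LD_LIBRARY_PATH` on Linux, `DYLD_LIBRARY_PATH` on macOS), and the helper name `library_path_env` is hypothetical.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, environ=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path for the given platform."""
    env = dict(os.environ if environ is None else environ)
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(name, "")
    env[name] = lib_dir + (":" + old if old else "")
    return env


# Usage, assuming the binary was built as described above:
# import subprocess
# subprocess.run(
#     ["./test-supertonic"],
#     cwd="/tmp",
#     env=library_path_env("/tmp/sherpa-onnx/shared/lib"),
#     check=True,
# )
```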
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "ja"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "ja"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'これは次世代のkaldiを使用したテキスト読み上げエンジンです';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'ja'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'ja'},
);
final audio = tts.generateWithConfig(text: 'これは次世代のkaldiを使用したテキスト読み上げエンジンです', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "ja"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ja\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
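In the C# example above, `Extra` is passed as a JSON string with hand-escaped quotes. As a minimal sketch, such a string could be produced from a dictionary instead of being escaped manually. The helper name `make_extra` is hypothetical and not part of sherpa-onnx. The sketch is written in Python for brevity.

```python
import json


def make_extra(**options: str) -> str:
    """Serialize keyword options into a JSON object string.

    Hypothetical helper for illustration only; not part of sherpa-onnx.
    """
    # ensure_ascii=False keeps any non-ASCII values readable in the output.
    return json.dumps(options, ensure_ascii=False)


if __name__ == "__main__":
    # The same payload as the examples in this section.
    print(make_extra(lang="ja"))
```

The resulting string can then be assigned to the `Extra` field in bindings that expect JSON text rather than a map.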
Kotlin API
Click to expand
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "ja"),
)
val audio = tts.generateWithConfigAndCallback(
text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
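The callback in the Kotlin example returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention with a callback that asks for generation to stop once a sample budget is reached. `SampleBudgetCallback` and `simulate` are hypothetical names. `simulate` is only a stand-in for the engine and does not call sherpa-onnx.

```python
from typing import Callable, List, Sequence


class SampleBudgetCallback:
    """Request a stop once `max_samples` samples have been received.

    Hypothetical helper; the 1/0 return convention mirrors the comments
    in the Kotlin example.
    """

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples: Sequence[float]) -> int:
        self.received += len(samples)
        # 1 means to continue, 0 means to stop.
        return 1 if self.received < self.max_samples else 0


def simulate(chunks: List[List[float]],
             callback: Callable[[Sequence[float]], int]) -> int:
    """Stand-in for a TTS engine: deliver chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


if __name__ == "__main__":
    cb = SampleBudgetCallback(max_samples=5)
    chunks = [[0.0] * 3, [0.0] * 3, [0.0] * 3]
    print(simulate(chunks, cb), cb.received)
```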
Java API
Click to expand
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "これは次世代のkaldiを使用したテキスト読み上げエンジンです";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"ja\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
Click to expand
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "ja"}';
Audio := Tts.GenerateWithConfig('これは次世代のkaldiを使用したテキスト読み上げエンジンです', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
Click to expand
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "これは次世代のkaldiを使用したテキスト読み上げエンジンです"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "ja"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio clips for different speakers are listed below:
Speaker 0
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 1
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 2
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 3
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 4
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 5
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 6
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 7
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 8
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Speaker 9
| No. | Text |
|---|---|
| 0 | こんにちは世界。 |
| 1 | 今日はどのように過ごしていますか。 |
| 2 | 空は青く、風は穏やかです。 |
| 3 | 機械学習はデータから学ぶ技術です。 |
| 4 | 音声合成は文章を自然な声に変換します。 |
| 5 | 図書館では多くの人が静かに本を読んでいます。 |
| 6 | 新しい列車の時刻表は来週から使われます。 |
| 7 | 研究者たちは小さな端末で動くモデルを評価しました。 |
| 8 | 音声アシスタントは毎日の作業を手伝います。 |
| 9 | 天気予報によると午後から雨が降るそうです。 |
Kazakh
This section lists text to speech models for Kazakh.
vits-piper-kk_KZ-iseke-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/iseke/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Click to expand
Model download address
Android APK
Click to expand
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
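The choice between `LD_LIBRARY_PATH` (Linux) and `DYLD_LIBRARY_PATH` (macOS) can also be made programmatically, for instance when the binary is launched from a script. The following Python sketch shows one way to do that. The function names are hypothetical, and only Linux and macOS are handled, matching the two cases described here.

```python
import platform
from typing import Dict, Optional


def library_path_variable(system: Optional[str] = None) -> str:
    """Return the dynamic-loader search variable for Linux or macOS."""
    system = system or platform.system()
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"Unsupported system: {system}")


def prepend_library_dir(lib_dir: str, env: Dict[str, str],
                        system: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of `env` with `lib_dir` prepended to the search path."""
    name = library_path_variable(system)
    updated = dict(env)
    current = updated.get(name, "")
    # Avoid a trailing ':' when the variable was previously unset.
    updated[name] = f"{lib_dir}:{current}" if current else lib_dir
    return updated


if __name__ == "__main__":
    env = prepend_library_dir("/tmp/sherpa-onnx/shared/lib", {}, system="Linux")
    print(env)
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when starting `/tmp/test-piper`.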
Use static library (static link)
Please see the documentation at
Python API
Click to expand
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
            data_dir="vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
            tokens="vits-piper-kk_KZ-iseke-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
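If `soundfile` is not available, the generated samples could also be saved with Python's standard `wave` module. The sketch below assumes that the samples are mono floating-point values in the range [-1, 1]; that is an assumption, not something stated in this document. The function name `write_wave_pcm16` is hypothetical.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float],
                     sample_rate: int) -> None:
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the objects from the example above, a call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.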
C++ API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx".into()),
tokens: Some("vits-piper-kk_KZ-iseke-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-kk_KZ-iseke-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Әлемнің жұлдыздары сенің көзің, жаным.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
Click to expand
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx',
tokens: 'vits-piper-kk_KZ-iseke-x_low/tokens.txt',
dataDir: 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
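The Node.js example reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched below in Python; the function name is hypothetical.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (smaller is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 32000 samples at 16000 Hz is 2 seconds of audio.
    print(real_time_factor(0.5, 32000, 16000))
```

An RTF below 1 means the audio was generated faster than it takes to play back.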
Dart API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx',
tokens: 'vits-piper-kk_KZ-iseke-x_low/tokens.txt',
dataDir: 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
lexicon: "",
tokens: "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
dataDir: "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Әлемнің жұлдыздары сенің көзің, жаным."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
tokens = "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
dataDir = "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Әлемнің жұлдыздары сенің көзің, жаным.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx");
vits.setTokens("vits-piper-kk_KZ-iseke-x_low/tokens.txt");
vits.setDataDir("vits-piper-kk_KZ-iseke-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-iseke-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-iseke-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-iseke-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-kk_KZ-iseke-x_low/kk_KZ-iseke-x_low.onnx",
Tokens: "vits-piper-kk_KZ-iseke-x_low/tokens.txt",
DataDir: "vits-piper-kk_KZ-iseke-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Әлемнің жұлдыздары сенің көзің, жаным."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Әлемнің жұлдыздары сенің көзің, жаным.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-kk_KZ-issai-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/issai/high
| Number of speakers | Sample rate |
|---|---|
| 6 | 22050 |
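With more than one speaker, the `sid` value passed at generation time selects the voice. The Python sketch below assumes that speaker IDs are zero-based and contiguous, so that a 6-speaker model accepts 0 through 5. That assumption is consistent with the `sid = 0` used throughout the examples but is not confirmed here. Both function names are hypothetical.

```python
from typing import List


def validate_sid(sid: int, num_speakers: int) -> int:
    """Return `sid` if it lies in [0, num_speakers), else raise ValueError."""
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_names(num_speakers: int, prefix: str = "speaker") -> List[str]:
    """Build one output filename per speaker ID."""
    return [f"{prefix}-{validate_sid(sid, num_speakers)}.wav"
            for sid in range(num_speakers)]


if __name__ == "__main__":
    for name in output_names(6):
        print(name)
```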
Download the model
Click to expand
Model download address
Android APK
Click to expand
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-issai-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Click to expand
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-kk_KZ-issai-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
            data_dir="vits-piper-kk_KZ-issai-high/espeak-ng-data",
            tokens="vits-piper-kk_KZ-issai-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
Click to expand
You can use the following code to play with vits-piper-kk_KZ-issai-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
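If you start the binary from a script, you can prepare the same variable programmatically. The following is a minimal Python sketch, assuming the library directory /tmp/sherpa-onnx/shared/lib used above. The helper name library_path_env is hypothetical, and every non-macOS platform is treated like Linux.

```python
import os
import sys


def library_path_env(lib_dir, platform=sys.platform, environ=None):
    """Return (variable name, new value) with lib_dir prepended.

    Mirrors the export lines above: DYLD_LIBRARY_PATH on macOS,
    LD_LIBRARY_PATH otherwise (assumption: non-macOS means Linux).
    """
    environ = os.environ if environ is None else environ
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = environ.get(name, "")
    # Avoid a trailing ':' when the variable was previously unset.
    value = f"{lib_dir}:{old}" if old else lib_dir
    return name, value


name, value = library_path_env("/tmp/sherpa-onnx/shared/lib")
print(f"export {name}={value}")
```

You can pass the returned pair to subprocess.run through its env argument instead of exporting the variable in your shell.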
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx".into()),
tokens: Some("vits-piper-kk_KZ-issai-high/tokens.txt".into()),
data_dir: Some("vits-piper-kk_KZ-issai-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Әлемнің жұлдыздары сенің көзің, жаным.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-kk_KZ-issai-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx',
tokens: 'vits-piper-kk_KZ-issai-high/tokens.txt',
dataDir: 'vits-piper-kk_KZ-issai-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
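The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The following Python sketch repeats that arithmetic on its own. It does not call sherpa-onnx, and the numbers in the usage line are made up for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    A value below 1.0 means generation is faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


# Hypothetical run: 0.5 s to generate 32000 samples at 16000 Hz (2 s of audio).
print(f"RTF = {real_time_factor(0.5, 32000, 16000):.3f}")  # prints "RTF = 0.250"
```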
Dart API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx',
tokens: 'vits-piper-kk_KZ-issai-high/tokens.txt',
dataDir: 'vits-piper-kk_KZ-issai-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
lexicon: "",
tokens: "vits-piper-kk_KZ-issai-high/tokens.txt",
dataDir: "vits-piper-kk_KZ-issai-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Әлемнің жұлдыздары сенің көзің, жаным."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-kk_KZ-issai-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-issai-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-issai-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
tokens = "vits-piper-kk_KZ-issai-high/tokens.txt",
dataDir = "vits-piper-kk_KZ-issai-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Әлемнің жұлдыздары сенің көзің, жаным.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
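The callback in the example above always returns 1, so generation never stops early. The following Python sketch illustrates the same convention (1 means continue, 0 means stop) with a callback that asks to stop once it has received enough samples. It is a simulation only: simulate is a hypothetical stand-in for the engine and does not call sherpa-onnx.

```python
class StopAfter:
    """Callback that requests a stop once max_samples have been received.

    Follows the convention used above: return 1 to continue, 0 to stop.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples):
        self.received += len(samples)
        return 1 if self.received < self.max_samples else 0


def simulate(chunks, callback):
    """Hypothetical stand-in for a TTS engine: deliver chunks until told to stop."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


cb = StopAfter(max_samples=2500)
chunks = [[0.0] * 1000 for _ in range(5)]  # five chunks of 1000 samples each
print(simulate(chunks, cb), cb.received)
```

In this run the third chunk pushes the total past 2500 samples, so only three of the five chunks are delivered.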
Java API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx");
vits.setTokens("vits-piper-kk_KZ-issai-high/tokens.txt");
vits.setDataDir("vits-piper-kk_KZ-issai-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-issai-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-issai-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-kk_KZ-issai-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-kk_KZ-issai-high/kk_KZ-issai-high.onnx",
Tokens: "vits-piper-kk_KZ-issai-high/tokens.txt",
DataDir: "vits-piper-kk_KZ-issai-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Әлемнің жұлдыздары сенің көзің, жаным."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Әлемнің жұлдыздары сенің көзің, жаным.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
vits-piper-kk_KZ-raya-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/kk/kk_KZ/raya/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
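You can check a generated file against the sample rate in the table above with Python's standard wave module. The following is a minimal sketch, assuming the file is an uncompressed PCM WAV (the wave module cannot read other encodings). To stay runnable without the model, it writes a short silent file as a stand-in for test.wav; the helper names are hypothetical.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


def write_silence(path, sample_rate, seconds):
    """Write a mono 16-bit silent WAV (stand-in for a generated test.wav)."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * int(sample_rate * seconds))


# Point `path` at your own test.wav to check a real file instead.
path = os.path.join(tempfile.mkdtemp(), "silence-16k.wav")
write_silence(path, 16000, 0.5)
rate, channels, duration = wav_info(path)
print(rate, channels, duration)
```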
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Әлемнің жұлдыздары сенің көзің, жаным.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-kk_KZ-raya-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
data_dir="vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
tokens="vits-piper-kk_KZ-raya-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Әлемнің жұлдыздары сенің көзің, жаным.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
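The example above assumes that the model directory has been extracted into the current working directory. Before creating the config, you may want to confirm that the three paths it uses actually exist. The following is a minimal sketch using only the standard library; the helper name missing_model_files is hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the paths used by the example above that do not exist."""
    root = Path(model_dir)
    expected = [
        root / model_name,        # the ONNX model file
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-kk_KZ-raya-x_low", "kk_KZ-raya-x_low.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```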
C++ API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
config.model.vits.tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Әлемнің жұлдыздары сенің көзің, жаным.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx".into()),
tokens: Some("vits-piper-kk_KZ-raya-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-kk_KZ-raya-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Әлемнің жұлдыздары сенің көзің, жаным.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx',
tokens: 'vits-piper-kk_KZ-raya-x_low/tokens.txt',
dataDir: 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Әлемнің жұлдыздары сенің көзің, жаным.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx',
tokens: 'vits-piper-kk_KZ-raya-x_low/tokens.txt',
dataDir: 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Әлемнің жұлдыздары сенің көзің, жаным.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
lexicon: "",
tokens: "vits-piper-kk_KZ-raya-x_low/tokens.txt",
dataDir: "vits-piper-kk_KZ-raya-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Әлемнің жұлдыздары сенің көзің, жаным."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
tokens = "vits-piper-kk_KZ-raya-x_low/tokens.txt",
dataDir = "vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Әлемнің жұлдыздары сенің көзің, жаным.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx");
vits.setTokens("vits-piper-kk_KZ-raya-x_low/tokens.txt");
vits.setDataDir("vits-piper-kk_KZ-raya-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Әлемнің жұлдыздары сенің көзің, жаным.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-kk_KZ-raya-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-kk_KZ-raya-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Әлемнің жұлдыздары сенің көзің, жаным.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-kk_KZ-raya-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-kk_KZ-raya-x_low/kk_KZ-raya-x_low.onnx",
Tokens: "vits-piper-kk_KZ-raya-x_low/tokens.txt",
DataDir: "vits-piper-kk_KZ-raya-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Әлемнің жұлдыздары сенің көзің, жаным."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Әлемнің жұлдыздары сенің көзің, жаным.
sample audios for different speakers are listed below:
Speaker 0
Korean
This section lists text to speech models for Korean.
supertonic-3-ko
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Korean (ko).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
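The model has 10 speakers, so valid speaker IDs run from 0 to 9. If the sid comes from user input, you may want to validate it before passing it on. The following is a minimal Python sketch; check_sid is a hypothetical helper, and how sherpa-onnx itself treats an out-of-range sid is not covered here.

```python
NUM_SPEAKERS = 10  # from the "Number of speakers" table above


def check_sid(sid, num_speakers=NUM_SPEAKERS):
    """Return sid if it is a valid speaker ID, otherwise raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(sid, int) or isinstance(sid, bool):
        raise ValueError(f"sid must be an integer, got {sid!r}")
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


print(check_sid(9))  # the largest valid sid for this model
```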
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ko"
audio = tts.generate("이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
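The example above selects the language through gen_config.extra["lang"], while the C API example in this section passes the same setting as the JSON string {"lang": "ko"}. The following sketch builds that string and rejects codes outside the 31 languages listed in the model info. It uses only the standard library; lang_extra is a hypothetical helper, not part of sherpa-onnx.

```python
import json

# The 31 language codes listed in "Info about this model".
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)


def lang_extra(lang):
    """Return the JSON string for the extra field of the generation config."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(lang_extra("ko"))  # prints {"lang": "ko"}
```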
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";
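const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
if (!tts) {
// Creation fails when the config is invalid, e.g., a model file is missing
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return -1;
}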
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"ko\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
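The note above uses LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on macOS. As a rough sketch, the same choice can be made programmatically when the binary is launched from a script. The helper names `library_path_var` and `env_with_lib_dir` are hypothetical and exist only for this illustration.

```python
import os
import sys


def library_path_var(platform: str = sys.platform) -> str:
    """Name of the variable the dynamic loader consults on this platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_lib_dir(lib_dir: str, env: dict, platform: str = sys.platform) -> dict:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    name = library_path_var(platform)
    new_env = dict(env)
    current = new_env.get(name, "")
    # Both variables use ':' as the separator on Linux and macOS
    new_env[name] = lib_dir + (":" + current if current else "")
    return new_env


env = env_with_lib_dir("/tmp/sherpa-onnx/shared/lib", dict(os.environ))
print(library_path_var(), "=", env[library_path_var()])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the compiled example.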
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to try supertonic-3 with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "ko"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to try supertonic-3 with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "ko"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
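Every binding in this section points at the same seven files inside the extracted model directory. Before creating the TTS engine, it can help to confirm that they are all present. The following is a small sketch using only the Python standard library; the file names are copied from the configurations above, and `missing_model_files` is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path

# File names taken from the supertonic configuration shown in this section
SUPERTONIC_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir: str) -> list:
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in SUPERTONIC_FILES if not (root / name).is_file()]


missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found")
```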
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'ko'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
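The Node.js example above reports the real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. A value below 1 means generation is faster than playback. The sketch below restates that arithmetic in Python; the numbers in the usage line are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Processing time divided by the duration of the generated audio."""
    if sample_rate <= 0 or num_samples <= 0:
        raise ValueError("audio must be non-empty with a positive sample rate")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Example: 0.5 s spent generating 32000 samples at 16000 Hz (2 s of audio)
print(f"RTF = {real_time_factor(0.5, 32000, 16000):.3f}")  # RTF = 0.250
```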
Dart API
You can use the following code to try supertonic-3 with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'ko'},
);
final audio = tts.generateWithConfig(text: '이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to try supertonic-3 with the Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "ko"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to try supertonic-3 with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ko\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to try supertonic-3 with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "ko"),
)
val audio = tts.generateWithConfigAndCallback(
text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
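The callbacks in the examples above follow one convention: return 1 to continue generating and 0 to stop. The sketch below illustrates that convention in plain Python by collecting chunks and requesting a stop once a sample limit is reached. `ChunkCollector` and `fake_generate` are hypothetical names; `fake_generate` only stands in for a TTS engine and does not call sherpa-onnx.

```python
from typing import Callable, List


class ChunkCollector:
    """Collects audio chunks and asks the generator to stop after a limit."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.samples: List[float] = []

    def __call__(self, chunk: List[float]) -> int:
        self.samples.extend(chunk)
        # 1 means continue, 0 means stop
        return 1 if len(self.samples) < self.max_samples else 0


def fake_generate(chunks: List[List[float]], callback: Callable[[List[float]], int]) -> int:
    """Hypothetical stand-in for an engine: feeds chunks until told to stop."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


collector = ChunkCollector(max_samples=5)
delivered = fake_generate([[0.0] * 3, [0.1] * 3, [0.2] * 3], collector)
print(delivered, len(collector.samples))  # 2 6
```

The third chunk is never delivered because the callback returns 0 after the second one.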
Java API
You can use the following code to try supertonic-3 with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"ko\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to try supertonic-3 with the Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "ko"}';
Audio := Tts.GenerateWithConfig('이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to try supertonic-3 with the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "이것은 차세대 kaldi를 사용하는 텍스트 음성 변환 엔진입니다"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "ko"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 1
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 2
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 3
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 4
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 5
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 6
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 7
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 8
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Speaker 9
0
안녕하세요 세계.
1
오늘 어떻게 지내세요?
2
하늘이 푸릅니다.
3
기계학습을 사랑합니다.
4
파이썬은 놀라워요.
5
모든 분께 좋은 아침입니다.
6
인공지능이 성장하고 있습니다.
7
음성 합성은 매력적입니다.
8
신경막은 강력합니다.
9
텍스트 음성 변환이 텍스트를 오디오로 변환합니다.
10
빠른 갈색 여우가 게으른 개를 뛰어넘습니다.
11
기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.
12
자연어 처리가 기계를 이해하도록 돕습니다.
13
딥러닝이 인공지능을 혁신했습니다.
14
음성 합성 기술이 크게 발전했습니다.
15
음성 클로닝이 음성 스타일을 복제할 수 있습니다.
16
텍스트 정규화가 올바른 발음에 중요합니다.
17
음성 비서가 기술과 상호작용하는 데 도움이 됩니다.
18
최신 TTS 시스템이 고품질 음성을 생성합니다.
19
인간 컴퓨터 상호작용이 더 직관적이 되었습니다.
Kurdish
This section lists text-to-speech models for Kurdish.
vits-piper-ku_TR-berfin_renas-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ku/ku_TR/berfin_renas/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
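If the CPU architecture of the device is known, for example from the output of `uname -m`, the matching ABI can be looked up. The sketch below maps common architecture strings to the four ABI names used by the Android NDK and falls back to arm64-v8a, following the advice above. The mapping and the helper `android_abi` are assumptions made for this illustration and are not part of sherpa-onnx.

```python
# Mapping from the CPU architecture reported by `uname -m` to Android ABI names
MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "i686": "x86",
    "x86_64": "x86_64",
}


def android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI for a machine string, falling back to default."""
    return MACHINE_TO_ABI.get(machine.strip().lower(), default)


print(android_abi("aarch64"))  # arm64-v8a
print(android_abi("armv7l"))   # armeabi-v7a
```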
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
config.model.vits.tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
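const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
if (!tts) {
// Creation fails when the config is invalid, e.g., a model file is missing
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return -1;
}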
const char *text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
data_dir="vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
tokens="vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
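The example above relies on the third-party soundfile package to save the audio. When that package is not available, a WAV file can be written with the standard library instead. The sketch below assumes the samples are mono floats in the range [-1, 1] and stores them as 16-bit PCM. `write_wave_int16` is a hypothetical helper, and the silent test signal is only a placeholder for `audio.samples`.

```python
import struct
import wave


def write_wave_int16(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# 0.1 s of silence at the 22050 Hz sample rate listed for this model
write_wave_int16("silence.wav", [0.0] * 2205, 22050)
```

With a real result, the call would take the generated samples and sample rate in place of the silent placeholder.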
C++ API
You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
config.model.vits.tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to try vits-piper-ku_TR-berfin_renas-medium with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx".into()),
tokens: Some("vits-piper-ku_TR-berfin_renas-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx',
tokens: 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt',
dataDir: 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
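The Node.js example above reports a real-time factor (RTF), which is the synthesis time divided by the duration of the generated audio. The same arithmetic can be sketched in Python. The helper name `real_time_factor` and the numbers in the usage lines are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s spent generating 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it plays back.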
Dart API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx',
tokens: 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt',
dataDir: 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
lexicon: "",
tokens: "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
dataDir: "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
tokens = "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
dataDir = "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
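The callback in the Kotlin example returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention with a callback that asks generation to stop once a sample budget is used up. `make_budget_callback` is a hypothetical helper, not part of sherpa-onnx, and it is exercised here with plain lists rather than a real TTS engine.

```python
def make_budget_callback(max_samples: int):
    """Return a callback that signals 'stop' once max_samples samples were seen."""
    seen = 0

    def callback(samples) -> int:
        nonlocal seen
        seen += len(samples)
        # 1 means to continue, 0 means to stop (same convention as above)
        return 1 if seen < max_samples else 0

    return callback


# Simulate two chunks of 3 samples each with a budget of 5 samples.
cb = make_budget_callback(5)
print(cb([0.0, 0.0, 0.0]))  # budget not reached yet
print(cb([0.0, 0.0, 0.0]))  # budget reached
```

Keeping the counter inside a closure avoids global state, so each generation call can get its own independent budget.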
Java API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx");
vits.setTokens("vits-piper-ku_TR-berfin_renas-medium/tokens.txt");
vits.setDataDir("vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ku_TR-berfin_renas-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ku_TR-berfin_renas-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ku_TR-berfin_renas-medium/ku_TR-berfin_renas-medium.onnx",
Tokens: "vits-piper-ku_TR-berfin_renas-medium/tokens.txt",
DataDir: "vits-piper-ku_TR-berfin_renas-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ev motorê nivîsandinê bi dengî ye ku Kaldiya serî de bi kar tîne
sample audios for different speakers are listed below:
Speaker 0
Latvian
This section lists text to speech models for Latvian.
vits-piper-lv_LV-aivars-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/lv/lv_LV/aivars/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
config.model.vits.tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header file and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-lv_LV-aivars-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
data_dir="vits-piper-lv_LV-aivars-medium/espeak-ng-data",
tokens="vits-piper-lv_LV-aivars-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Zeme nenes augļus, ja tēvs sēj, bet māte auž.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
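The example above relies on the soundfile package to save the result. If that dependency is not available, a 16-bit PCM WAV file can be written with the Python standard library alone. The sketch below assumes mono float samples in the range [-1, 1]. `write_wave_int16` is a hypothetical helper, not part of sherpa-onnx.

```python
import math
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in data: 0.1 s of a 440 Hz tone at 22050 Hz instead of real TTS output.
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / 22050) for n in range(2205)]
    write_wave_int16("tone.wav", tone, 22050)
```

With a generated `audio` object, the call would be `write_wave_int16("test.wav", audio.samples, audio.sample_rate)`.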
C++ API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
config.model.vits.tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header file and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx".into()),
tokens: Some("vits-piper-lv_LV-aivars-medium/tokens.txt".into()),
data_dir: Some("vits-piper-lv_LV-aivars-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-lv_LV-aivars-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx',
tokens: 'vits-piper-lv_LV-aivars-medium/tokens.txt',
dataDir: 'vits-piper-lv_LV-aivars-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Zeme nenes augļus, ja tēvs sēj, bet māte auž.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx',
tokens: 'vits-piper-lv_LV-aivars-medium/tokens.txt',
dataDir: 'vits-piper-lv_LV-aivars-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Zeme nenes augļus, ja tēvs sēj, bet māte auž.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
lexicon: "",
tokens: "vits-piper-lv_LV-aivars-medium/tokens.txt",
dataDir: "vits-piper-lv_LV-aivars-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
tokens = "vits-piper-lv_LV-aivars-medium/tokens.txt",
dataDir = "vits-piper-lv_LV-aivars-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx");
vits.setTokens("vits-piper-lv_LV-aivars-medium/tokens.txt");
vits.setDataDir("vits-piper-lv_LV-aivars-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Zeme nenes augļus, ja tēvs sēj, bet māte auž.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-lv_LV-aivars-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-lv_LV-aivars-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Zeme nenes augļus, ja tēvs sēj, bet māte auž.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-lv_LV-aivars-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-lv_LV-aivars-medium/lv_LV-aivars-medium.onnx",
Tokens: "vits-piper-lv_LV-aivars-medium/tokens.txt",
DataDir: "vits-piper-lv_LV-aivars-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Zeme nenes augļus, ja tēvs sēj, bet māte auž."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Zeme nenes augļus, ja tēvs sēj, bet māte auž.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-lv
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Latvian (lv).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
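Since the model accepts only the 31 language codes and the speaker IDs 0 to 9 listed above, it can help to validate a request before passing it to the engine. The following is a minimal sketch. The names `SUPERTONIC_3_LANGS`, `NUM_SPEAKERS`, and `check_request` are hypothetical and not part of sherpa-onnx.

```python
# Language codes copied from the list above (31 entries).
SUPERTONIC_3_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi".split()
)
NUM_SPEAKERS = 10  # valid speaker IDs are 0..9


def check_request(lang: str, sid: int) -> None:
    """Raise ValueError if lang or sid is outside what the model lists."""
    if lang not in SUPERTONIC_3_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")


check_request("lv", 0)  # Latvian with speaker 0 passes silently
```

Failing early with a clear message is usually easier to debug than an error raised from inside the native library.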
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "lv"
audio = tts.generate("Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
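The configuration above refers to seven files inside the model directory. A quick check that they all exist before creating the engine can save a confusing startup failure. This is a minimal sketch using only the standard library. `SUPERTONIC_FILES` and `missing_model_files` are hypothetical names, and the file list is simply copied from the example above.

```python
from pathlib import Path

# File names taken from the configuration shown above.
SUPERTONIC_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir, names=SUPERTONIC_FILES):
    """Return the names from `names` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).is_file()]


missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
if missing:
    print("Missing files:", ", ".join(missing))
```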
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"lv\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
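In the C example above, `gen_cfg.extra` is set to a string that looks like a JSON object, hand-escaped for a C string literal. Assuming the field is indeed parsed as JSON, such strings are easier to produce with a JSON library than by hand. The sketch below builds the same text in Python. `make_extra` and `as_c_string_literal` are hypothetical helpers, not part of sherpa-onnx.

```python
import json


def make_extra(**options) -> str:
    """Serialize keyword options to a JSON object string."""
    return json.dumps(options)


def as_c_string_literal(text: str) -> str:
    """Escape backslashes and double quotes so text can be pasted into C source."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


extra = make_extra(lang="lv")
print(extra)                       # {"lang": "lv"}
print(as_c_string_literal(extra))  # "{\"lang\": \"lv\"}"
```

The second printed line matches the literal assigned to `gen_cfg.extra` in the C code.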
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "lv"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "lv"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'lv'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
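The real-time factor (RTF) printed by the script above is the synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal worked sketch of that arithmetic in Python follows; the function name and the example numbers are made up for illustration and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Hypothetical numbers: 72000 samples at 24000 Hz is 3 seconds of audio,
# and we pretend synthesis took 1.5 seconds.
rtf = real_time_factor(elapsed_seconds=1.5, num_samples=72000, sample_rate=24000)
print(f"RTF = {rtf:.3f}")
```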
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'lv'},
);
final audio = tts.generateWithConfig(text: 'Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "lv"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"lv\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
val config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "lv"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"lv\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "lv"}';
Audio := Tts.GenerateWithConfig('Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "lv"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 1
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 2
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 3
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 4
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 5
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 6
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 7
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 8
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Speaker 9
0
Sveika pasaule.
1
Kā tev šodien klājas?
2
Debesis ir zilas, un vējš ir maigs.
3
Mašīnmācīšanās palīdz datoriem mācīties no datiem.
4
Runas sintēze pārvērš tekstu skaidrā skaņā.
5
Skolēni bibliotēkā lasīja īsu stāstu.
6
Vilciens kavējās sliežu remonta dēļ.
7
Mazie modeļi ātri darbojas vietējās ierīcēs.
8
Balss asistents palīdz ikdienas uzdevumos.
9
Stabila lasīšana ir svarīga īsiem un gariem teikumiem.
Lithuanian
This section lists text-to-speech models for Lithuanian.
supertonic-3-lt
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Lithuanian (lt).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
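The language code is passed to the model through the extra field of the generation config, which the API examples on this page set to {"lang": "lt"}. The following is a minimal Python sketch, assuming you want to reject unsupported language codes and out-of-range speaker IDs before calling the engine. The helpers make_extra and check_sid are hypothetical names used for illustration; they are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 in total).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()

# Number of speakers, taken from the table above.
NUM_SPEAKERS = 10


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID, else raise."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


def make_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field, e.g. for the C API."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(check_sid(0), make_extra("lt"))
```

The returned string has the same form as the one the C, C#, Java, Pascal, and Go examples assign to extra by hand.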
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx and soundfile (which the example below uses to write the wave file) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "lt"
audio = tts.generate("Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
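Before constructing the config above, it can help to confirm that the extracted model directory contains every file the config refers to. The following is a minimal sketch under that assumption; missing_model_files is a hypothetical helper written for this page, not a sherpa-onnx function, and the file names are the ones used in the config above.

```python
from pathlib import Path
from typing import List

# File names taken from the config above.
REQUIRED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir: str) -> List[str]:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found")
```

Running the check from the directory where you extracted the model gives a clear list of missing files.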
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";
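const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
if (!tts) {
  // Creation fails if the config is invalid, e.g., a model file is missing
  fprintf(stderr, "Failed to create the TTS engine. Please check your config.\n");
  return -1;
}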
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"lt\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "lt"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "lt"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'lt'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
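The Node.js example above reports a real-time factor (RTF), which is the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows; `real_time_factor` is a hypothetical helper name and the numbers in the usage comment are illustrative, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Audio duration in seconds is the sample count over the sample rate.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Usage with made-up numbers: 2 seconds of audio generated in 0.5 seconds.
#   real_time_factor(0.5, 2 * 22050, 22050)
```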
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'lt'},
);
final audio = tts.generateWithConfig(text: 'Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "lt"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"lt\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "lt"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
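The callback in the example above always returns 1, so generation is never interrupted. The following Python sketch illustrates the same return convention (1 to continue, 0 to stop) with a callback that asks to stop once a sample budget is reached. The class `SampleBudget` and its `(samples, progress)` signature are hypothetical and are not tied to any particular sherpa-onnx binding.

```python
class SampleBudget:
    """Callable that requests a stop after max_samples samples were received."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples, progress=0.0):
        # Count every chunk handed to the callback.
        self.received += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if self.received < self.max_samples else 0


# Usage with plain lists standing in for audio chunks:
#   cb = SampleBudget(max_samples=5)
#   cb([0.0, 0.0, 0.0])  # budget not reached yet
#   cb([0.0, 0.0, 0.0])  # budget reached
```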
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"lt\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
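Several of the examples above pass the `extra` field as a JSON string with hand-escaped quotes. A serializer avoids escaping mistakes. The sketch below is in Python; `make_extra` is a hypothetical helper, and it assumes the field accepts any JSON object string equivalent to the literal `{"lang": "lt"}` used in the examples.

```python
import json


def make_extra(lang):
    """Return the JSON object string used for the extra field."""
    # json.dumps handles quoting and escaping of the value.
    return json.dumps({"lang": lang})


# Usage:
#   make_extra("lt")
```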
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "lt"}';
Audio := Tts.GenerateWithConfig('Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "lt"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for different speakers is listed below:
Speaker 0
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 1
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 2
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 3
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 4
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 5
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 6
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 7
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 8
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Speaker 9
| No. | Text |
|---|---|
| 0 | Labas pasauli. |
| 1 | Kaip šiandien laikaisi? |
| 2 | Dangus mėlynas, o vėjas švelnus. |
| 3 | Mašininis mokymasis padeda kompiuteriams mokytis iš duomenų. |
| 4 | Kalbos sintezė paverčia tekstą aiškiu garsu. |
| 5 | Mokiniai bibliotekoje perskaitė trumpą istoriją. |
| 6 | Traukinys vėlavo dėl bėgių priežiūros. |
| 7 | Maži modeliai greitai veikia vietiniuose įrenginiuose. |
| 8 | Balso asistentas padeda atlikti kasdienes užduotis. |
| 9 | Stabilus skaitymas svarbus trumpiems ir ilgiems sakiniams. |
Luxembourgish
This section lists text to speech models for Luxembourgish.
vits-piper-lb_LU-marylux-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/lb/lb_LU/marylux/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
config.model.vits.tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
data_dir="vits-piper-lb_LU-marylux-medium/espeak-ng-data",
tokens="vits-piper-lb_LU-marylux-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.",
sid=0,
speed=1.0)
# Write a WAV file, consistent with the other examples on this page
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
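The example above relies on the third-party soundfile package to save the result. Where that package is not available, the samples can be written with Python's standard-library `wave` module. The following is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]; `write_wav_int16` is a hypothetical helper and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        # Clip to [-1, 1] (assumed input range) before scaling to int16.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Usage (not executed here):
#   write_wav_int16("test.wav", audio.samples, audio.sample_rate)
```

Converting to 16-bit integers keeps the file readable by most players, at the cost of quantizing the float samples.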
C++ API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
config.model.vits.tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx".into()),
tokens: Some("vits-piper-lb_LU-marylux-medium/tokens.txt".into()),
data_dir: Some("vits-piper-lb_LU-marylux-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-lb_LU-marylux-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx',
tokens: 'vits-piper-lb_LU-marylux-medium/tokens.txt',
dataDir: 'vits-piper-lb_LU-marylux-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
// Double quotes are used because the text contains an apostrophe (d'Dier)
const text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx',
tokens: 'vits-piper-lb_LU-marylux-medium/tokens.txt',
dataDir: 'vits-piper-lb_LU-marylux-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
// Double quotes are used because the text contains an apostrophe (d'Dier)
final audio = tts.generateWithConfig(text: "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.", config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
lexicon: "",
tokens: "vits-piper-lb_LU-marylux-medium/tokens.txt",
dataDir: "vits-piper-lb_LU-marylux-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
tokens = "vits-piper-lb_LU-marylux-medium/tokens.txt",
dataDir = "vits-piper-lb_LU-marylux-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx");
vits.setTokens("vits-piper-lb_LU-marylux-medium/tokens.txt");
vits.setDataDir("vits-piper-lb_LU-marylux-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-lb_LU-marylux-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-lb_LU-marylux-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
// The apostrophe inside a Pascal string literal must be doubled ('')
Audio := Tts.GenerateWithConfig('Op der Haaptstrooss sinn all Stroossen Brécken, awer d''Dier kann iwwerall erreecht ginn.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-lb_LU-marylux-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-lb_LU-marylux-medium/lb_LU-marylux-medium.onnx",
Tokens: "vits-piper-lb_LU-marylux-medium/tokens.txt",
DataDir: "vits-piper-lb_LU-marylux-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.
sample audios for different speakers are listed below:
Speaker 0
Malayalam
This section lists text to speech models for Malayalam.
vits-piper-ml_IN-arjun-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ml/ml_IN/arjun/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
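The code examples for this model all assume that the extracted model directory sits in the current working directory and contains the .onnx file, tokens.txt, and the espeak-ng-data directory. A minimal sketch of a pre-flight check for that layout follows; the helper name missing_model_files is hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected paths under model_dir that do not exist.

    Hypothetical helper: the three entries mirror the paths used by the
    code examples for this model (model file, tokens.txt, espeak-ng-data).
    """
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]
```

For example, missing_model_files("vits-piper-ml_IN-arjun-medium", "ml_IN-arjun-medium.onnx") returns an empty list when all three entries are present.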
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
config.model.vits.tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ml_IN-arjun-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
data_dir="vits-piper-ml_IN-arjun-medium/espeak-ng-data",
tokens="vits-piper-ml_IN-arjun-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
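The example above relies on the third-party soundfile package to save the result. If that dependency is not available, the samples can be written with Python's standard wave module instead. The following is a minimal sketch under the assumption that the samples are mono floats in the range [-1, 1]; the helper name write_wav_int16 is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file.

    Assumes samples are floats nominally in [-1, 1]; values outside
    that range are clipped. Returns the number of samples written.
    """
    frames = bytearray()
    for s in samples:
        # Clip before scaling so out-of-range values do not wrap around.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)
```

With the objects from the example above, a call would look like write_wav_int16("test.wav", audio.samples, audio.sample_rate).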
C++ API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
config.model.vits.tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx".into()),
tokens: Some("vits-piper-ml_IN-arjun-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ml_IN-arjun-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ml_IN-arjun-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx',
tokens: 'vits-piper-ml_IN-arjun-medium/tokens.txt',
dataDir: 'vits-piper-ml_IN-arjun-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx',
tokens: 'vits-piper-ml_IN-arjun-medium/tokens.txt',
dataDir: 'vits-piper-ml_IN-arjun-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
lexicon: "",
tokens: "vits-piper-ml_IN-arjun-medium/tokens.txt",
dataDir: "vits-piper-ml_IN-arjun-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
tokens = "vits-piper-ml_IN-arjun-medium/tokens.txt",
dataDir = "vits-piper-ml_IN-arjun-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx");
vits.setTokens("vits-piper-ml_IN-arjun-medium/tokens.txt");
vits.setDataDir("vits-piper-ml_IN-arjun-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ml_IN-arjun-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ml_IN-arjun-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ml_IN-arjun-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ml_IN-arjun-medium/ml_IN-arjun-medium.onnx",
Tokens: "vits-piper-ml_IN-arjun-medium/tokens.txt",
DataDir: "vits-piper-ml_IN-arjun-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ml_IN-meera-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ml/ml_IN/meera/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ml_IN-meera-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
config.model.vits.tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ml_IN-meera-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
data_dir="vits-piper-ml_IN-meera-medium/espeak-ng-data",
tokens="vits-piper-ml_IN-meera-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
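To judge how fast synthesis runs, the Node.js example for this model reports a real-time factor (RTF), defined as elapsed time divided by the duration of the generated audio. The following is a minimal Python sketch of the same calculation; the helper names real_time_factor and timed are hypothetical and not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

With the objects from the example above, one could write audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0) and then real_time_factor(elapsed, len(audio.samples), audio.sample_rate).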
C++ API
You can use the following code to play with vits-piper-ml_IN-meera-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
config.model.vits.tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx".into()),
tokens: Some("vits-piper-ml_IN-meera-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ml_IN-meera-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ml_IN-meera-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx',
tokens: 'vits-piper-ml_IN-meera-medium/tokens.txt',
dataDir: 'vits-piper-ml_IN-meera-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
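The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio. The same arithmetic is shown below as a minimal Python sketch. The function name and the sample numbers are illustrative and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 3 seconds of 22050 Hz audio generated in 0.6 seconds.
    rtf = real_time_factor(0.6, 3 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play it back.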
Dart API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx',
tokens: 'vits-piper-ml_IN-meera-medium/tokens.txt',
dataDir: 'vits-piper-ml_IN-meera-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
lexicon: "",
tokens: "vits-piper-ml_IN-meera-medium/tokens.txt",
dataDir: "vits-piper-ml_IN-meera-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ml_IN-meera-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ml_IN-meera-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ml_IN-meera-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
tokens = "vits-piper-ml_IN-meera-medium/tokens.txt",
dataDir = "vits-piper-ml_IN-meera-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
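The callback in the Kotlin example above returns 1 to continue and 0 to stop. The following Python sketch is a toy model of that contract only. It does not call sherpa-onnx, and `run_with_callback` and the fake chunks are hypothetical names. Whether the real library keeps the chunk that triggered the stop is an assumption made here for illustration.

```python
from typing import Callable, List, Sequence


def run_with_callback(
    chunks: Sequence[List[float]],
    callback: Callable[[List[float]], int],
) -> List[float]:
    """Feed chunks to callback in order; stop as soon as it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:
            break  # the caller asked to stop early
    return collected


if __name__ == "__main__":
    fake_chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]  # stand-in for audio
    seen = []

    def stop_after_two(samples: List[float]) -> int:
        seen.append(len(samples))
        return 1 if len(seen) < 2 else 0

    audio = run_with_callback(fake_chunks, stop_after_two)
    print(len(audio), "samples kept after", len(seen), "callback calls")
```

A callback of this shape is a natural place to start playback or update a progress indicator while the remaining audio is still being generated.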
Java API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx");
vits.setTokens("vits-piper-ml_IN-meera-medium/tokens.txt");
vits.setDataDir("vits-piper-ml_IN-meera-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ml_IN-meera-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ml_IN-meera-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ml_IN-meera-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ml_IN-meera-medium/ml_IN-meera-medium.onnx",
Tokens: "vits-piper-ml_IN-meera-medium/tokens.txt",
DataDir: "vits-piper-ml_IN-meera-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.
sample audios for different speakers are listed below:
Speaker 0
Nepali
This section lists text to speech models for Nepali.
vits-piper-ne_NP-chitwan-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/chitwan/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
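If you would rather check than guess, the ABI can be derived from the machine string a device reports (for example, from `uname -m`). The sketch below maps common machine strings to the standard Android ABI names. The helper name `android_abi` is hypothetical, and which ABIs are actually offered for download is not shown here, so treat the mapping as a guide only.

```python
# Map the output of `uname -m` to the matching Android ABI name.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit CPU
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the Android ABI for a machine string, or the default if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    for m in ("aarch64", "armv7l", "x86_64", "unknown-cpu"):
        print(m, "->", android_abi(m))
```

On a device connected via adb, `adb shell getprop ro.product.cpu.abi` reports the ABI directly, which avoids the mapping altogether.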
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ne_NP-chitwan-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
data_dir="vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
tokens="vits-piper-ne_NP-chitwan-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
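The Python example above saves audio with soundfile. If that package is not available, the standard-library `wave` module can write a 16-bit PCM WAV file instead. This is a minimal sketch, assuming the generated samples are mono floats in the range [-1, 1]. `write_wave_int16` is a hypothetical helper, not a sherpa-onnx function, and the sine tone is stand-in data.

```python
import struct
import wave
from typing import Sequence


def write_wave_int16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


if __name__ == "__main__":
    import math
    import os
    import tempfile

    # One second of a 440 Hz tone; real data would come from audio.samples
    # and audio.sample_rate in the example above.
    rate = 22050
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
    out = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_wave_int16(out, tone, rate)
    print("Saved to", out)
```

Clipping before the conversion keeps out-of-range samples from wrapping around when they are packed as 16-bit integers.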
C++ API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx".into()),
tokens: Some("vits-piper-ne_NP-chitwan-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ne_NP-chitwan-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx',
tokens: 'vits-piper-ne_NP-chitwan-medium/tokens.txt',
dataDir: 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx',
tokens: 'vits-piper-ne_NP-chitwan-medium/tokens.txt',
dataDir: 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
lexicon: "",
tokens: "vits-piper-ne_NP-chitwan-medium/tokens.txt",
dataDir: "vits-piper-ne_NP-chitwan-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
tokens = "vits-piper-ne_NP-chitwan-medium/tokens.txt",
dataDir = "vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx");
vits.setTokens("vits-piper-ne_NP-chitwan-medium/tokens.txt");
vits.setDataDir("vits-piper-ne_NP-chitwan-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ne_NP-chitwan-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ne_NP-chitwan-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ne_NP-chitwan-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ne_NP-chitwan-medium/ne_NP-chitwan-medium.onnx",
Tokens: "vits-piper-ne_NP-chitwan-medium/tokens.txt",
DataDir: "vits-piper-ne_NP-chitwan-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ne_NP-google-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/google/medium
| Number of speakers | Sample rate |
|---|---|
| 18 | 22050 |
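With 18 speakers, the `sid` value passed at generation time selects the voice. The sketch below validates a speaker ID and builds one output filename per speaker. It assumes speaker IDs are zero-based (0 to 17), which matches the `sid` of 0 used in the examples for this model. Both helper names are hypothetical.

```python
NUM_SPEAKERS = 18  # from the table above


def check_sid(sid: int, num_speakers: int = NUM_SPEAKERS) -> int:
    """Return sid unchanged if it is a valid zero-based speaker ID."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


def output_names(prefix: str = "speaker", num_speakers: int = NUM_SPEAKERS) -> list:
    """One output filename per speaker ID, e.g. speaker-00.wav."""
    return [f"{prefix}-{sid:02d}.wav" for sid in range(num_speakers)]


if __name__ == "__main__":
    names = output_names()
    print(names[0], "...", names[-1])
```

To compare voices, loop over `range(NUM_SPEAKERS)`, pass each value as `sid` to the generation call of your chosen API, and save the result under the matching name.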
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ne_NP-google-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ne_NP-google-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
data_dir="vits-piper-ne_NP-google-medium/espeak-ng-data",
tokens="vits-piper-ne_NP-google-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
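When `config.validate()` fails in the example above, a wrong or missing path is a likely cause. The sketch below lists which of the three paths used in the config do not exist under the model directory. `missing_model_files` is a hypothetical helper, and the directory layout is taken from the paths in the example rather than from any sherpa-onnx API.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths under model_dir that do not exist."""
    root = Path(model_dir)
    expected = [root / onnx_name, root / "tokens.txt", root / "espeak-ng-data"]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-ne_NP-google-medium", "ne_NP-google-medium.onnx"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running the check from the directory where you extracted the model makes a wrong working directory easy to spot.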
C++ API
You can use the following code to play with vits-piper-ne_NP-google-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ne_NP-google-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx".into()),
tokens: Some("vits-piper-ne_NP-google-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ne_NP-google-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ne_NP-google-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx',
tokens: 'vits-piper-ne_NP-google-medium/tokens.txt',
dataDir: 'vits-piper-ne_NP-google-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
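The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is shown below as a minimal Python sketch; the function name and the numbers in the demo are illustrative and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")
```

An RTF below 1 means the audio is generated faster than it plays back.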
Dart API
You can use the following code to play with vits-piper-ne_NP-google-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx',
tokens: 'vits-piper-ne_NP-google-medium/tokens.txt',
dataDir: 'vits-piper-ne_NP-google-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ne_NP-google-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
lexicon: "",
tokens: "vits-piper-ne_NP-google-medium/tokens.txt",
dataDir: "vits-piper-ne_NP-google-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ne_NP-google-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-google-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-google-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ne_NP-google-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
tokens = "vits-piper-ne_NP-google-medium/tokens.txt",
dataDir = "vits-piper-ne_NP-google-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ne_NP-google-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx");
vits.setTokens("vits-piper-ne_NP-google-medium/tokens.txt");
vits.setDataDir("vits-piper-ne_NP-google-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ne_NP-google-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ne_NP-google-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ne_NP-google-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ne_NP-google-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ne_NP-google-medium/ne_NP-google-medium.onnx",
Tokens: "vits-piper-ne_NP-google-medium/tokens.txt",
DataDir: "vits-piper-ne_NP-google-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
vits-piper-ne_NP-google-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ne/ne_NP/google/x_low
| Number of speakers | Sample rate |
|---|---|
| 18 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
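As a rough guide, the ABI can be guessed from the machine name that a device reports, for example from the output of uname -m. The mapping below is a heuristic sketch and not part of sherpa-onnx; guess_android_abi is a hypothetical helper that falls back to arm64-v8a, in line with the advice above.

```python
# Heuristic mapping from `uname -m` style machine names to Android ABI names.
_MACHINE_TO_ABI = {
    "aarch64": "arm64-v8a",
    "arm64": "arm64-v8a",
    "armv8l": "armeabi-v7a",  # 32-bit userspace on a 64-bit CPU
    "armv7l": "armeabi-v7a",
    "x86_64": "x86_64",
    "i686": "x86",
    "i386": "x86",
}


def guess_android_abi(machine: str, default: str = "arm64-v8a") -> str:
    """Return the likely Android ABI for a machine name, or `default` if unknown."""
    return _MACHINE_TO_ABI.get(machine.strip().lower(), default)


if __name__ == "__main__":
    for name in ("aarch64", "armv7l", "x86_64", "something-else"):
        print(name, "->", guess_android_abi(name))
```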
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ne_NP-google-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ne_NP-google-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
data_dir="vits-piper-ne_NP-google-x_low/espeak-ng-data",
tokens="vits-piper-ne_NP-google-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
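If soundfile is not installed, the samples can also be saved with Python's standard wave module. The following is a minimal sketch and not part of sherpa-onnx; it assumes audio.samples is a sequence of mono float samples in the range [-1, 1], and write_wave_pcm16 is a hypothetical helper name.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Hypothetical data: a short burst of silence at 16 kHz.
    write_wave_pcm16("silence.wav", [0.0] * 1600, 16000)
```

With the audio object from the example above, the call would be write_wave_pcm16("test.wav", audio.samples, audio.sample_rate).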
C++ API
You can use the following code to play with vits-piper-ne_NP-google-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
config.model.vits.tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx".into()),
tokens: Some("vits-piper-ne_NP-google-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-ne_NP-google-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ne_NP-google-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx',
tokens: 'vits-piper-ne_NP-google-x_low/tokens.txt',
dataDir: 'vits-piper-ne_NP-google-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx',
tokens: 'vits-piper-ne_NP-google-x_low/tokens.txt',
dataDir: 'vits-piper-ne_NP-google-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
lexicon: "",
tokens: "vits-piper-ne_NP-google-x_low/tokens.txt",
dataDir: "vits-piper-ne_NP-google-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ne_NP-google-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-ne_NP-google-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ne_NP-google-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
tokens = "vits-piper-ne_NP-google-x_low/tokens.txt",
dataDir = "vits-piper-ne_NP-google-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx");
vits.setTokens("vits-piper-ne_NP-google-x_low/tokens.txt");
vits.setDataDir("vits-piper-ne_NP-google-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ne_NP-google-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ne_NP-google-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ne_NP-google-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ne_NP-google-x_low/ne_NP-google-x_low.onnx",
Tokens: "vits-piper-ne_NP-google-x_low/tokens.txt",
DataDir: "vits-piper-ne_NP-google-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
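Samples like the ones above could be produced by calling generate once per speaker ID. The following is a minimal sketch, assuming a tts object created as in the Python API example, with a generate(text=..., sid=..., speed=...) method; generate_for_all_speakers is a hypothetical helper, and the stand-in class exists only so that the sketch runs without a model.

```python
def generate_for_all_speakers(tts, text, num_speakers, speed=1.0):
    """Yield (sid, audio) for each speaker ID from 0 to num_speakers - 1.

    `tts` is assumed to expose generate(text=..., sid=..., speed=...),
    as the OfflineTts object in the Python example does.
    """
    for sid in range(num_speakers):
        yield sid, tts.generate(text=text, sid=sid, speed=speed)


if __name__ == "__main__":
    class EchoTts:
        """Stand-in for the real TTS object; returns a label instead of audio."""

        def generate(self, text, sid, speed):
            return f"audio for speaker {sid}"

    for sid, audio in generate_for_all_speakers(EchoTts(), "hello", 3):
        print(sid, audio)
```

For this model, num_speakers would be 18 (see the table under "Info about this model"), and each returned audio can be saved as in the Python example.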
Norwegian
This section lists text to speech models for Norwegian.
vits-piper-no_NO-talesyntese-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/no/no_NO/talesyntese/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
config.model.vits.tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Uskyldig kan stormen veroorzaken";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
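Instead of exporting the variable in the shell, the library path can be set only for the process that runs the example. The following is a minimal Python sketch, assuming the install prefix used above; env_with_library_path is a hypothetical helper and not part of sherpa-onnx.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(name, "")
    env[name] = f"{lib_dir}:{old}" if old else lib_dir
    return env


if __name__ == "__main__":
    env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
    for key in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if key in env:
            print(key, "=", env[key])
```

It could then be used as subprocess.run(["/tmp/test-piper"], env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"), check=True).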
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-no_NO-talesyntese-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
            data_dir="vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
            tokens="vits-piper-no_NO-talesyntese-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Uskyldig kan stormen veroorzaken",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
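When the configuration check fails, the usual cause is a model that was not downloaded or was extracted somewhere else. The following is a minimal sketch of a pre-check, not part of sherpa-onnx: the helper name `missing_model_files` is hypothetical, and it only assumes the directory layout used in the example above (an `.onnx` file, `tokens.txt`, and an `espeak-ng-data` directory).

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths that do not exist under model_dir.

    Hypothetical helper (not part of sherpa-onnx). It assumes the layout
    used in the example above: <model_name>.onnx, tokens.txt and an
    espeak-ng-data directory inside the extracted model directory.
    """
    root = Path(model_dir)
    expected = [
        root / f"{model_name}.onnx",
        root / "tokens.txt",
        root / "espeak-ng-data",
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-no_NO-talesyntese-medium", "no_NO-talesyntese-medium"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running it from the directory where the model was extracted lists whatever is absent, which is usually quicker to act on than a generic configuration error.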
C++ API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
config.model.vits.tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Uskyldig kan stormen veroorzaken";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx".into()),
tokens: Some("vits-piper-no_NO-talesyntese-medium/tokens.txt".into()),
data_dir: Some("vits-piper-no_NO-talesyntese-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Uskyldig kan stormen veroorzaken";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx',
tokens: 'vits-piper-no_NO-talesyntese-medium/tokens.txt',
dataDir: 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Uskyldig kan stormen veroorzaken';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
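The Node.js example above prints a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python. The function name `real_time_factor` and the numbers in the demo are hypothetical and are not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    Values below 1.0 mean synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 44100 samples at 22050 Hz (2 s of audio)
    # generated in 0.5 s of wall-clock time.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF of 0.25, for instance, would mean that one second of audio takes a quarter of a second to generate.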
Dart API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx',
tokens: 'vits-piper-no_NO-talesyntese-medium/tokens.txt',
dataDir: 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Uskyldig kan stormen veroorzaken', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
lexicon: "",
tokens: "vits-piper-no_NO-talesyntese-medium/tokens.txt",
dataDir: "vits-piper-no_NO-talesyntese-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Uskyldig kan stormen veroorzaken"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Uskyldig kan stormen veroorzaken";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
tokens = "vits-piper-no_NO-talesyntese-medium/tokens.txt",
dataDir = "vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Uskyldig kan stormen veroorzaken",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx");
vits.setTokens("vits-piper-no_NO-talesyntese-medium/tokens.txt");
vits.setDataDir("vits-piper-no_NO-talesyntese-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Uskyldig kan stormen veroorzaken";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-no_NO-talesyntese-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-no_NO-talesyntese-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Uskyldig kan stormen veroorzaken', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-no_NO-talesyntese-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-no_NO-talesyntese-medium/no_NO-talesyntese-medium.onnx",
Tokens: "vits-piper-no_NO-talesyntese-medium/tokens.txt",
DataDir: "vits-piper-no_NO-talesyntese-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Uskyldig kan stormen veroorzaken"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Uskyldig kan stormen veroorzaken
sample audios for different speakers are listed below:
Speaker 0
Persian
This section lists text-to-speech models for Persian.
- vits-piper-fa_IR-amir-medium
- vits-piper-fa_IR-ganji-medium
- vits-piper-fa_IR-ganji_adabi-medium
- vits-piper-fa_IR-gyro-medium
- vits-piper-fa_IR-reza_ibrahim-medium
vits-piper-fa_IR-amir-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/amir/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fa_IR-amir-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fa_IR-amir-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
            data_dir="vits-piper-fa_IR-amir-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-amir-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
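The C++ examples on this page pass a progress callback that returns 1 to continue generating and 0 to stop. The sketch below shows the same contract in Python. It assumes, without confirmation from this page, that `tts.generate` accepts a `callback` argument invoked as `callback(samples, progress)`. The class name `ChunkCollector` and the `max_samples` budget are hypothetical, and the demo feeds it simulated chunks instead of real model output.

```python
class ChunkCollector:
    """Collect generated audio chunks and stop once a sample budget is hit.

    Hypothetical helper. It follows the contract described in the C++
    example: return 1 to continue generating, 0 to stop.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.chunks = []
        self.total = 0

    def __call__(self, samples, progress):
        self.chunks.append(samples)
        self.total += len(samples)
        print(f"Progress: {progress * 100:.1f}%")
        return 1 if self.total < self.max_samples else 0


if __name__ == "__main__":
    # Simulate three chunks of 1000 samples with a budget of 2500 samples.
    collector = ChunkCollector(max_samples=2500)
    for i in range(3):
        keep_going = collector([0.0] * 1000, (i + 1) / 3)
        if not keep_going:
            break
    print("Collected", collector.total, "samples")
```

With a real model, the collector could be passed as `tts.generate(text, sid=0, speed=1.0, callback=collector)`, provided that keyword is available in your installed sherpa-onnx version.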
C++ API
You can use the following code to play with vits-piper-fa_IR-amir-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx".into()),
tokens: Some("vits-piper-fa_IR-amir-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fa_IR-amir-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fa_IR-amir-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx',
tokens: 'vits-piper-fa_IR-amir-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-amir-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx',
tokens: 'vits-piper-fa_IR-amir-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-amir-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
lexicon: "",
tokens: "vits-piper-fa_IR-amir-medium/tokens.txt",
dataDir: "vits-piper-fa_IR-amir-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fa_IR-amir-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-amir-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-amir-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
tokens = "vits-piper-fa_IR-amir-medium/tokens.txt",
dataDir = "vits-piper-fa_IR-amir-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx");
vits.setTokens("vits-piper-fa_IR-amir-medium/tokens.txt");
vits.setDataDir("vits-piper-fa_IR-amir-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fa_IR-amir-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fa_IR-amir-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fa_IR-amir-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fa_IR-amir-medium/fa_IR-amir-medium.onnx",
Tokens: "vits-piper-fa_IR-amir-medium/tokens.txt",
DataDir: "vits-piper-fa_IR-amir-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fa_IR-ganji-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/ganji/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
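The export commands apply to an interactive shell. If the compiled example is launched from a script instead, the same variable can be set for the child process only. The following is a minimal Python sketch under stated assumptions: the helper name `env_with_library_path` is hypothetical, and the library directory is the install prefix used in the build steps.

```python
import os
import platform


def env_with_library_path(lib_dir, base_env=None, system=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path variable of the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    system = platform.system() if system is None else system
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"), check=True)` would run the binary with the path prepended, without changing the environment of the calling process.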
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fa_IR-ganji-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
            data_dir="vits-piper-fa_IR-ganji-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-ganji-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
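If soundfile is not available, the generated samples can also be saved with the Python standard library. The following is a minimal sketch, assuming `audio.samples` is a sequence of floats in the range [-1, 1] and `audio.sample_rate` is an integer (22050 for this model). The helper name `write_wav_int16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Returns the number of samples written.
    """
    frames = bytearray()
    for s in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)
```

With the objects from the example above, a call would look like `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.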
C++ API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx".into()),
tokens: Some("vits-piper-fa_IR-ganji-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fa_IR-ganji-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fa_IR-ganji-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx',
tokens: 'vits-piper-fa_IR-ganji-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-ganji-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
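The Node.js example reports a real-time factor (RTF), computed as the elapsed synthesis time divided by the duration of the generated audio. A value below 1 means synthesis ran faster than playback. The arithmetic can be sketched in Python as follows. The function name `real_time_factor` and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (audio duration in seconds, real-time factor)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds.
    duration = num_samples / sample_rate
    return duration, elapsed_seconds / duration


# Hypothetical numbers: 44100 samples at 22050 Hz generated in 0.5 seconds.
duration, rtf = real_time_factor(0.5, 44100, 22050)
print(f"Wave duration {duration:.3f} seconds, RTF = {rtf:.3f}")
# prints: Wave duration 2.000 seconds, RTF = 0.250
```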
Dart API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx',
tokens: 'vits-piper-fa_IR-ganji-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-ganji-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
lexicon: "",
tokens: "vits-piper-fa_IR-ganji-medium/tokens.txt",
dataDir: "vits-piper-fa_IR-ganji-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
tokens = "vits-piper-fa_IR-ganji-medium/tokens.txt",
dataDir = "vits-piper-fa_IR-ganji-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx");
vits.setTokens("vits-piper-fa_IR-ganji-medium/tokens.txt");
vits.setDataDir("vits-piper-fa_IR-ganji-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fa_IR-ganji-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fa_IR-ganji-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fa_IR-ganji-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fa_IR-ganji-medium/fa_IR-ganji-medium.onnx",
Tokens: "vits-piper-fa_IR-ganji-medium/tokens.txt",
DataDir: "vits-piper-fa_IR-ganji-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fa_IR-ganji_adabi-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/ganji_adabi/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
            data_dir="vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
            tokens="vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
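Before constructing the config, it can help to confirm that the model files are where the paths expect them. The following is a minimal sketch using only the standard library. It assumes the model archive was extracted into the current working directory with the layout used in the paths above. The helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the required paths under model_dir that do not exist."""
    root = Path(model_dir)
    required = [
        root / model_name,        # the ONNX acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # directory with espeak-ng data
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files(
    "vits-piper-fa_IR-ganji_adabi-medium",
    "fa_IR-ganji_adabi-medium.onnx",
)
if missing:
    print("Missing:", ", ".join(missing))
```

An empty list means all three paths exist. It does not verify that the files are valid.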
C++ API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx".into()),
tokens: Some("vits-piper-fa_IR-ganji_adabi-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx',
tokens: 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx',
tokens: 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
lexicon: "",
tokens: "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
dataDir: "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
tokens = "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
dataDir = "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
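The callback contract used above (return 1 to continue, 0 to stop) can be illustrated without a model. The following is only a sketch: `fake_generate` is a hypothetical stand-in for a TTS engine, not part of sherpa-onnx, and it assumes the chunk that triggered the stop is still kept in the returned audio.

```python
from typing import Callable, List


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for an engine that emits audio chunk by chunk."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        # Hand the newly generated chunk to the caller.
        if callback(chunk) == 0:
            # 0 means "stop": skip the remaining chunks.
            break
    return collected


def stop_after(limit: int) -> Callable[[List[float]], int]:
    """Build a callback that asks the engine to stop once `limit` chunks were seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < limit else 0

    return callback


chunks = [[0.1, 0.2], [0.3], [0.4, 0.5]]
all_audio = fake_generate(chunks, lambda samples: 1)   # never asks to stop
partial_audio = fake_generate(chunks, stop_after(2))   # stops after 2 chunks
print(len(all_audio), len(partial_audio))
```

A callback like this is typically where you would stream chunks to an audio device, or cancel generation when the user interrupts playback.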
Java API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx");
vits.setTokens("vits-piper-fa_IR-ganji_adabi-medium/tokens.txt");
vits.setDataDir("vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fa_IR-ganji_adabi-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fa_IR-ganji_adabi-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fa_IR-ganji_adabi-medium/fa_IR-ganji_adabi-medium.onnx",
Tokens: "vits-piper-fa_IR-ganji_adabi-medium/tokens.txt",
DataDir: "vits-piper-fa_IR-ganji_adabi-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fa_IR-gyro-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/gyro/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
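To confirm that a generated file matches the sample rate in the table above, you can inspect its header with Python's standard `wave` module. This is a minimal sketch: `describe_wav` and `demo-22050.wav` are names chosen for illustration, it assumes the file is PCM-encoded WAV, and it writes a synthetic one-second tone so that it runs without a model.

```python
import math
import struct
import wave


def describe_wav(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Write a one-second, 22050 Hz, mono, 16-bit test tone as a stand-in
# for a file produced by one of the TTS examples.
sample_rate = 22050
with wave.open("demo-22050.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    f.setframerate(sample_rate)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / sample_rate)))
        for i in range(sample_rate)
    )
    f.writeframes(frames)

rate, channels, duration = describe_wav("demo-22050.wav")
print(rate, channels, duration)
```

Pointing `describe_wav` at a `test.wav` written by the examples on this page should report 22050 for this model, provided the file is stored as PCM.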
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
data_dir="vits-piper-fa_IR-gyro-medium/espeak-ng-data",
tokens="vits-piper-fa_IR-gyro-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
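The Node.js example on this page reports a real-time factor (RTF), defined as elapsed synthesis time divided by the duration of the generated audio. The same measurement can be sketched in Python. The helper names `real_time_factor` and `timed` are illustrative and not part of sherpa-onnx, and the sample list below is a stand-in for `audio.samples`.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn):
    """Call fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Stand-in for a call such as tts.generate(...): two seconds of silence at 22050 Hz.
samples, elapsed = timed(lambda: [0.0] * 44100)
rtf = real_time_factor(elapsed, len(samples), 22050)
print(f"Elapsed {elapsed:.3f} s, RTF = {rtf:.3f}")
```

With a real model, wrap the generation call in `timed` and pass `len(audio.samples)` and `audio.sample_rate` to `real_time_factor`.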
C++ API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx".into()),
tokens: Some("vits-piper-fa_IR-gyro-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fa_IR-gyro-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fa_IR-gyro-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx',
tokens: 'vits-piper-fa_IR-gyro-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-gyro-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx',
tokens: 'vits-piper-fa_IR-gyro-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-gyro-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
lexicon: "",
tokens: "vits-piper-fa_IR-gyro-medium/tokens.txt",
dataDir: "vits-piper-fa_IR-gyro-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
tokens = "vits-piper-fa_IR-gyro-medium/tokens.txt",
dataDir = "vits-piper-fa_IR-gyro-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx");
vits.setTokens("vits-piper-fa_IR-gyro-medium/tokens.txt");
vits.setDataDir("vits-piper-fa_IR-gyro-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fa_IR-gyro-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fa_IR-gyro-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fa_IR-gyro-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fa_IR-gyro-medium/fa_IR-gyro-medium.onnx",
Tokens: "vits-piper-fa_IR-gyro-medium/tokens.txt",
DataDir: "vits-piper-fa_IR-gyro-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-fa_IR-reza_ibrahim-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/fa/fa_IR/reza_ibrahim/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
data_dir="vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
tokens="vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
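Writing MP3 through soundfile may depend on how the underlying libsndfile was built. If that is not available, a 16-bit PCM WAV file can be written with the Python standard library alone. The sketch below is not part of sherpa-onnx: `write_wav_int16` is a hypothetical helper, and it assumes the samples are mono floats in the range [-1, 1].

```python
import array
import sys
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV."""
    # Clamp each sample, then scale it to the signed 16-bit range.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores samples in little-endian order
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Stand-in for audio.samples and audio.sample_rate from the example above.
write_wav_int16("test-int16.wav", [0.0, 0.5, -0.5, 0.25], 22050)
print("Saved to test-int16.wav")
```

With a real model you would call `write_wav_int16("test.wav", audio.samples, audio.sample_rate)` instead of `sf.write`.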
C++ API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
config.model.vits.tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
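The variable to export depends on the operating system. The following is a minimal Python sketch, not part of sherpa-onnx, that picks the variable name for Linux or macOS and prepends the library directory to any existing value. The helper name `loader_env` is hypothetical.

```python
import os
import platform


def loader_env(lib_dir, system=None, environ=None):
    """Return (variable_name, new_value) for the dynamic loader search path.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    system = system or platform.system()
    environ = os.environ if environ is None else environ
    if system == "Linux":
        name = "LD_LIBRARY_PATH"
    elif system == "Darwin":  # macOS
        name = "DYLD_LIBRARY_PATH"
    else:
        raise ValueError(f"unsupported system: {system}")
    old = environ.get(name, "")
    # Prepend lib_dir so it is searched first; skip the separator if unset.
    value = f"{lib_dir}:{old}" if old else lib_dir
    return name, value


if __name__ == "__main__":
    name, value = loader_env("/tmp/sherpa-onnx/shared/lib", system="Linux", environ={})
    print(f"export {name}={value}")
```

Prepending rather than overwriting keeps any search path that is already set.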
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx".into()),
tokens: Some("vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt".into()),
data_dir: Some("vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx',
tokens: 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
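The Node.js example above reports a real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. The function name and the numbers in the demo are illustrative assumptions, not measurements.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 3 seconds of audio at 22050 Hz generated in 0.6 s.
    rtf = real_time_factor(0.6, 3 * 22050, 22050)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio was generated faster than it takes to play.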
Dart API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx',
tokens: 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt',
dataDir: 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
lexicon: "",
tokens: "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
dataDir: "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
tokens = "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
dataDir = "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
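The callback in the example above always returns 1, so generation never stops early. As a sketch of the same 1 = continue / 0 = stop convention, the following Python class counts the samples it receives and asks for a stop once a limit is reached. The class `StopAfter` is hypothetical and is driven here by a simulated chunk loop rather than by sherpa-onnx.

```python
class StopAfter:
    """Callback that requests a stop once enough samples have arrived.

    Hypothetical helper; it mirrors the 1 = continue / 0 = stop convention.
    """

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples):
        self.received += len(samples)
        return 1 if self.received < self.max_samples else 0


if __name__ == "__main__":
    cb = StopAfter(max_samples=5)
    # Simulate a generator that delivers audio in chunks of 2 samples.
    for chunk in ([0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]):
        if cb(chunk) == 0:
            break
    print(cb.received)
```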
Java API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx");
vits.setTokens("vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt");
vits.setDataDir("vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-fa_IR-reza_ibrahim-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-fa_IR-reza_ibrahim-medium/fa_IR-reza_ibrahim-medium.onnx",
Tokens: "vits-piper-fa_IR-reza_ibrahim-medium/tokens.txt",
DataDir: "vits-piper-fa_IR-reza_ibrahim-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.
sample audio for each speaker is listed below:
Speaker 0
Polish
This section lists text-to-speech models for Polish.
vits-piper-pl_PL-bass-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/bass/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
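The sample rate in the table fixes how many samples make up one second of audio. The following minimal Python sketch converts a sample count to a duration and estimates the raw audio size. It assumes 16-bit mono PCM output, which is an assumption about the saved file and is not stated here. The function names are hypothetical.

```python
SAMPLE_RATE = 22050  # sample rate from the table above


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Return the playback duration of num_samples samples."""
    return num_samples / sample_rate


def pcm16_mono_bytes(num_samples):
    """Return the raw audio size, assuming 2 bytes per sample (16-bit mono).

    Container headers such as the WAV header are not counted.
    """
    return num_samples * 2


if __name__ == "__main__":
    n = 5 * SAMPLE_RATE  # five seconds of audio
    print(duration_seconds(n), "seconds,", pcm16_mono_bytes(n), "bytes")
```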
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
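If you can read the CPU architecture string of a device, for example from `uname -m`, it can be mapped to an Android ABI name. The following Python sketch shows one such mapping. The table and the helper `guess_abi` are illustrative assumptions that use the standard Android ABI names, and the sketch falls back to arm64-v8a for unknown values.

```python
# Illustrative mapping from machine strings to standard Android ABI names.
ABI_BY_MACHINE = {
    "aarch64": "arm64-v8a",
    "armv7l": "armeabi-v7a",
    "armv8l": "armeabi-v7a",  # 32-bit userland on a 64-bit CPU
    "x86_64": "x86_64",
    "i686": "x86",
}


def guess_abi(machine, default="arm64-v8a"):
    """Return the ABI for a machine string, or the default if unknown."""
    return ABI_BY_MACHINE.get(machine.strip().lower(), default)


if __name__ == "__main__":
    print(guess_abi("aarch64"))
```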
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-bass-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
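const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
// Stop early if the TTS object could not be created, e.g., due to wrong paths
if (!tts) {
fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
return -1;
}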
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
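The run step above assumes that the model has been downloaded and extracted into the current directory. The following Python sketch, not part of sherpa-onnx, checks that the three paths used by the example exist before you run the binary. The helper name `missing_model_files` is hypothetical.

```python
import os


def missing_model_files(model_dir, model_name):
    """Return the expected paths that do not exist under model_dir."""
    expected = [
        os.path.join(model_dir, f"{model_name}.onnx"),
        os.path.join(model_dir, "tokens.txt"),
        os.path.join(model_dir, "espeak-ng-data"),
    ]
    return [p for p in expected if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_model_files("vits-piper-pl_PL-bass-high", "pl_PL-bass-high")
    for path in missing:
        print("missing:", path)
```

An empty result means all three paths referenced by the example are present.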
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-bass-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
            data_dir="vits-piper-pl_PL-bass-high/espeak-ng-data",
            tokens="vits-piper-pl_PL-bass-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
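If soundfile is not available, the generated samples can also be saved with Python's standard `wave` module. The following is a minimal sketch that assumes the samples are floats in the range [-1, 1] and writes them as 16-bit mono PCM. The helper `write_wave_pcm16` is hypothetical and is not part of sherpa-onnx.

```python
import os
import struct
import tempfile
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write float samples in [-1, 1] as a 16-bit mono PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "test.wav")
    write_wave_pcm16(out, [0.0, 0.5, -0.5], 22050)
    print("Saved to", out)
```

With the example above, the call would look like `write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)`.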
C++ API
You can use the following code to play with vits-piper-pl_PL-bass-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-bass-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx".into()),
tokens: Some("vits-piper-pl_PL-bass-high/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-bass-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-bass-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx',
tokens: 'vits-piper-pl_PL-bass-high/tokens.txt',
dataDir: 'vits-piper-pl_PL-bass-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pl_PL-bass-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx',
tokens: 'vits-piper-pl_PL-bass-high/tokens.txt',
dataDir: 'vits-piper-pl_PL-bass-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-bass-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-bass-high/tokens.txt",
dataDir: "vits-piper-pl_PL-bass-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-bass-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-bass-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-bass-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-bass-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
tokens = "vits-piper-pl_PL-bass-high/tokens.txt",
dataDir = "vits-piper-pl_PL-bass-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pl_PL-bass-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx");
vits.setTokens("vits-piper-pl_PL-bass-high/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-bass-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-bass-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-bass-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-bass-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-bass-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-bass-high/pl_PL-bass-high.onnx",
Tokens: "vits-piper-pl_PL-bass-high/tokens.txt",
DataDir: "vits-piper-pl_PL-bass-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audio for each speaker is listed below:
Speaker 0
vits-piper-pl_PL-darkman-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/darkman/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
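With one speaker, the only valid speaker ID (`sid`) is 0. The following Python sketch validates a speaker ID against the number of speakers before it is passed on. The helper `check_speaker_id` is hypothetical, and how the library itself treats out-of-range IDs is not covered here.

```python
def check_speaker_id(sid, num_speakers):
    """Return sid if 0 <= sid < num_speakers, else raise ValueError."""
    if not 0 <= sid < num_speakers:
        raise ValueError(
            f"sid must be in [0, {num_speakers - 1}], got {sid}"
        )
    return sid


if __name__ == "__main__":
    # This model has a single speaker, so only sid 0 passes the check.
    print(check_speaker_id(0, num_speakers=1))
```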
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
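const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
// Stop early if the TTS object could not be created, e.g., due to wrong paths
if (!tts) {
fprintf(stderr, "Failed to create the TTS object. Please check your config\n");
return -1;
}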
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-darkman-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
            data_dir="vits-piper-pl_PL-darkman-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-darkman-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
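To sanity-check a generated file, its header can be read back with Python's standard wave module. The sketch below is only an illustration: the helper names wav_info and write_stand_in are hypothetical, the stand-in tone exists so that the code runs without a TTS model, and it assumes the file under inspection is PCM-encoded WAV.

```python
import math
import os
import struct
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        return sample_rate, f.getnchannels(), f.getnframes() / sample_rate


def write_stand_in(path, sample_rate=22050, seconds=0.5):
    """Write a 440 Hz mono 16-bit tone (hypothetical stand-in for TTS output)."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / sample_rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(frames)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "stand-in.wav")
    write_stand_in(path)
    print(wav_info(path))
```

For this model the reported sample rate should match the 22050 listed in the table above; a different value would suggest the file came from another model.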
C++ API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-darkman-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-darkman-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-darkman-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx',
tokens: 'vits-piper-pl_PL-darkman-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-darkman-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
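The Node.js example above prints a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. The arithmetic can be sketched in Python as follows; the function name and the numbers are hypothetical and serve only to illustrate the formula.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical run: 44100 samples at 22050 Hz is 2 seconds of audio,
    # generated in 0.5 seconds of wall-clock time.
    rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
    print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

An RTF below 1 means the audio is produced faster than it takes to play back.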
Dart API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx',
tokens: 'vits-piper-pl_PL-darkman-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-darkman-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-darkman-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-darkman-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
tokens = "vits-piper-pl_PL-darkman-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-darkman-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx");
vits.setTokens("vits-piper-pl_PL-darkman-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-darkman-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-darkman-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-darkman-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-darkman-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-darkman-medium/pl_PL-darkman-medium.onnx",
Tokens: "vits-piper-pl_PL-darkman-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-darkman-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pl_PL-gosia-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/gosia/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
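If the compiled binary is launched from a script rather than an interactive shell, the same library search path can be passed through the child process environment. The following Python sketch shows one way to build that environment; the helper name env_with_library_path is hypothetical and is not part of sherpa-onnx.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-loader search path variable for the given platform."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir if not old else f"{lib_dir}:{old}"
    return env


if __name__ == "__main__":
    env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(f"{name}={env[name]}")
```

The resulting dictionary can be handed to subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env, check=True), assuming the binary and the model directory are in /tmp as described above.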
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-gosia-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
            data_dir="vits-piper-pl_PL-gosia-medium/espeak-ng-data",
            tokens="vits-piper-pl_PL-gosia-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
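Every example for this model passes the same three paths: the .onnx file, tokens.txt, and the espeak-ng-data directory. A small pre-flight check can report which of them are missing before the TTS object is created. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it assumes the directory is named vits-piper-<voice> and contains <voice>.onnx, as in the examples for this model. It requires Python 3.9 or later for str.removeprefix.

```python
from pathlib import Path


def vits_piper_paths(model_dir):
    """Build the three paths that the examples pass to the VITS config.

    Assumes the layout used in this section: a directory named
    'vits-piper-<voice>' holding '<voice>.onnx', 'tokens.txt'
    and 'espeak-ng-data/'.
    """
    model_dir = Path(model_dir)
    voice = model_dir.name.removeprefix("vits-piper-")
    return {
        "model": model_dir / f"{voice}.onnx",
        "tokens": model_dir / "tokens.txt",
        "data_dir": model_dir / "espeak-ng-data",
    }


def missing_paths(model_dir):
    """Return the config keys whose path does not exist on disk."""
    return [k for k, p in vits_piper_paths(model_dir).items() if not p.exists()]


if __name__ == "__main__":
    missing = missing_paths("vits-piper-pl_PL-gosia-medium")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running the check from the directory where the model archive was extracted gives an early, readable error instead of a failure inside the native library.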
C++ API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-gosia-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-gosia-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-gosia-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx',
tokens: 'vits-piper-pl_PL-gosia-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-gosia-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx',
tokens: 'vits-piper-pl_PL-gosia-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-gosia-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-gosia-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-gosia-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
tokens = "vits-piper-pl_PL-gosia-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-gosia-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
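The callback in the example above returns 1 to continue generating and 0 to stop. The Python sketch below illustrates that convention with a stand-in generator; fake_generate and make_callback are hypothetical names and do not call sherpa-onnx, and whether a real engine keeps the chunk on which the callback returned 0 is an assumption of this stand-in.

```python
def make_callback(stop_after_chunks):
    """Return a callback that asks the generator to stop after N chunks."""
    seen = 0

    def callback(samples):
        nonlocal seen
        seen += 1
        # 1 means continue, 0 means stop
        return 1 if seen < stop_after_chunks else 0

    return callback


def fake_generate(chunks, callback):
    """Stand-in for a TTS engine: hands each audio chunk to the callback
    and stops early when the callback returns 0."""
    produced = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:
            break
    return produced


if __name__ == "__main__":
    chunks = [[0.1, 0.2], [0.3], [0.4]]
    print(fake_generate(chunks, make_callback(stop_after_chunks=2)))
```

Stopping early is useful when, for example, a user cancels playback while a long text is still being synthesized.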
Java API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx");
vits.setTokens("vits-piper-pl_PL-gosia-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-gosia-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-gosia-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-gosia-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-gosia-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-gosia-medium/pl_PL-gosia-medium.onnx",
Tokens: "vits-piper-pl_PL-gosia-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-gosia-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pl_PL-jarvis_wg_glos-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
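After you generate audio with one of the examples below, you may want to confirm that the output matches the sample rate in the table above. The following is a minimal sketch using only the Python standard library. It assumes the output is a plain PCM WAV file with a single channel; `describe_wav` and `matches_model_card` are hypothetical helper names, not part of sherpa-onnx.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, duration


def matches_model_card(path, expected_rate=22050, expected_channels=1):
    """True when the file has the expected sample rate and channel count.

    The default of one channel is an assumption; the table only lists the
    sample rate and the number of speakers.
    """
    channels, sample_rate, _ = describe_wav(path)
    return channels == expected_channels and sample_rate == expected_rate
```

For example, `matches_model_card("test.wav")` should return True for a file produced by this model if the assumptions above hold.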
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
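If you launch the compiled binary from a script instead of an interactive shell, the same library path has to be passed in the environment of the child process. The following is a minimal sketch in Python; `library_env` is a hypothetical helper, and the paths are the ones assumed in the build steps above.

```python
import os
import sys


def library_env(lib_dir, base_env=None, platform=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path (LD_LIBRARY_PATH on Linux,
    DYLD_LIBRARY_PATH on macOS)."""
    env = dict(os.environ if base_env is None else base_env)
    platform = sys.platform if platform is None else platform
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Both variables use ":" as the separator on Linux and macOS.
    env[var] = lib_dir + (":" + old if old else "")
    return env


# Usage (not executed here, since it requires the compiled binary):
#   import subprocess
#   subprocess.run(["/tmp/test-piper"], cwd="/tmp", check=True,
#                  env=library_env("/tmp/sherpa-onnx/shared/lib"))
```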
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
data_dir="vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
tokens="vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
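If `config.validate()` fails, a common cause is a wrong or missing path. The following is a minimal sketch that checks the three paths used in the config above before you create the TTS object. `missing_model_files` is a hypothetical helper, not part of sherpa-onnx, and it only checks that the paths exist, not that their contents are valid.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the paths expected by the VITS config that do not exist."""
    root = Path(model_dir)
    required = [
        root / onnx_name,          # the acoustic model
        root / "tokens.txt",       # the token table
        root / "espeak-ng-data",   # the espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]
```

For example, `missing_model_files("vits-piper-pl_PL-jarvis_wg_glos-medium", "pl_PL-jarvis_wg_glos-medium.onnx")` returns an empty list when the model has been extracted into the current directory.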
C++ API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
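The real-time factor (RTF) printed by the example above is the processing time divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. The following is a minimal sketch of the same arithmetic in Python; `real_time_factor` is a hypothetical helper name.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    num_samples / sample_rate is the duration of the audio in seconds.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For example, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.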
Dart API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
tokens = "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
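The return value of the callback (1 to continue, 0 to stop) can be used to cut generation short. The following is a minimal sketch in Python of a callback that asks the engine to stop once a sample budget is reached. `make_limit_callback` is a hypothetical helper. The `(samples, progress)` signature follows the callbacks shown in the C++ and Rust examples; whether your binding passes both arguments is an assumption you should check against its documentation.

```python
def make_limit_callback(max_samples):
    """Build a callback that returns 1 until max_samples have been seen,
    then returns 0 to request that generation stop."""
    state = {"seen": 0}

    def callback(samples, progress):
        # samples is the chunk generated so far in this call;
        # progress is unused here.
        state["seen"] += len(samples)
        return 1 if state["seen"] < max_samples else 0

    return callback
```

For example, `make_limit_callback(22050 * 5)` stops after about 5 seconds of audio at 22050 Hz.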
Java API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx");
vits.setTokens("vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
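The examples on this page pass the generated samples and the sample rate to a helper that writes a wave file. To illustrate what such a helper does conceptually, the following is a minimal sketch using only the Python standard library. It assumes the samples are floats in the range [-1, 1] and writes them as 16-bit mono PCM; the actual implementation in sherpa-onnx may differ, and `write_wave` here is a hypothetical stand-in.

```python
import struct
import wave


def write_wave(path, samples, sample_rate):
    """Write float samples in [-1, 1] as a 16-bit mono PCM WAV file.

    Returns the number of samples written.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))      # clamp out-of-range values
        pcm += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)                      # 2 bytes = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(samples)
```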
Go API
You can use the following code to play with vits-piper-pl_PL-jarvis_wg_glos-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-jarvis_wg_glos-medium/pl_PL-jarvis_wg_glos-medium.onnx",
Tokens: "vits-piper-pl_PL-jarvis_wg_glos-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-jarvis_wg_glos-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pl_PL-justyna_wg_glos-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
data_dir="vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
tokens="vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
tokens = "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx");
vits.setTokens("vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-justyna_wg_glos-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-justyna_wg_glos-medium/pl_PL-justyna_wg_glos-medium.onnx",
Tokens: "vits-piper-pl_PL-justyna_wg_glos-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-justyna_wg_glos-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audio for each speaker is listed below:
Speaker 0
vits-piper-pl_PL-mc_speech-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pl/pl_PL/mc_speech/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
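const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
if (!tts) {
// Creation fails if the config is invalid, e.g., a model file is missing
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return 1;
}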
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
data_dir="vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
tokens="vits-piper-pl_PL-mc_speech-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
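If soundfile is not installed, or its underlying libsndfile build cannot write MP3, the generated samples can also be saved as 16-bit PCM WAV with Python's standard wave module. The following is a minimal sketch, assuming the samples are floats in the range [-1, 1]; write_wav_int16 is a hypothetical helper name and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    Returns the number of samples written.
    """
    # Clip each sample to [-1, 1], then scale to the int16 range.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm)
```

With the objects from the example above, a call would look like write_wav_int16("test.wav", audio.samples, audio.sample_rate).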
C++ API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-mc_speech-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-mc_speech-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx',
tokens: 'vits-piper-pl_PL-mc_speech-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
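The Node.js example above reports the real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. An RTF below 1 means generation runs faster than playback. The same arithmetic is sketched below in Python; the function name is hypothetical and the numbers in the usage lines are illustrative, not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return (duration_seconds, rtf) for a generated waveform."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # length of the audio in seconds
    return duration, elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz is 2 seconds of audio.
duration, rtf = real_time_factor(
    elapsed_seconds=0.5, num_samples=44100, sample_rate=22050
)
print(f"duration={duration:.3f}s rtf={rtf:.3f}")  # duration=2.000s rtf=0.250
```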
Dart API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx',
tokens: 'vits-piper-pl_PL-mc_speech-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
tokens = "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
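The callback's return value controls generation: as the comments in the example above state, 1 continues and 0 stops. The following Python sketch illustrates that convention with a hypothetical make_limited_callback helper that asks for a stop after a fixed number of chunks. The simulate_engine function is only a stand-in that calls the callback in a loop; it is not sherpa-onnx code.

```python
def make_limited_callback(max_chunks):
    """Return a callback that continues (1) until max_chunks calls, then stops (0)."""
    state = {"calls": 0, "num_samples": 0}

    def callback(samples):
        state["calls"] += 1
        state["num_samples"] += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if state["calls"] < max_chunks else 0

    return callback, state


def simulate_engine(chunks, callback):
    """Stand-in for a TTS engine: deliver chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


callback, state = make_limited_callback(max_chunks=2)
delivered = simulate_engine([[0.0] * 100, [0.0] * 100, [0.0] * 100], callback)
print(delivered, state["num_samples"])  # 2 200
```

Keeping the counters in a small state dictionary lets the caller inspect how much audio arrived before the stop was requested.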
Java API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx");
vits.setTokens("vits-piper-pl_PL-mc_speech-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-mc_speech-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-mc_speech-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-mc_speech-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-mc_speech-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-mc_speech-medium/pl_PL-mc_speech-medium.onnx",
Tokens: "vits-piper-pl_PL-mc_speech-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-mc_speech-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audio for each speaker is listed below:
Speaker 0
vits-piper-pl_PL-meski_wg_glos-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
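const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
if (!tts) {
// Creation fails if the config is invalid, e.g., a model file is missing
fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
return 1;
}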
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
data_dir="vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
tokens="vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
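The three paths in the config above are relative, so they must exist under the current working directory when the script runs. The following is a minimal sketch of a pre-flight check that reports what is missing before the engine is created. missing_model_files is a hypothetical helper, not part of sherpa-onnx, and the expected layout (one .onnx file, tokens.txt, and an espeak-ng-data directory) is taken from the paths used above.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that are absent under model_dir."""
    root = Path(model_dir)
    data_dir = root / "espeak-ng-data"
    expected = [
        root / onnx_name,     # ONNX model file
        root / "tokens.txt",  # token table
        data_dir,             # espeak-ng data directory
    ]
    missing = [str(p) for p in expected if not p.exists()]
    # The data path must be a directory, not a regular file.
    if data_dir.exists() and not data_dir.is_dir():
        missing.append(str(data_dir))
    return missing


missing = missing_model_files(
    "vits-piper-pl_PL-meski_wg_glos-medium",
    "pl_PL-meski_wg_glos-medium.onnx",
)
if missing:
    print("Missing:", ", ".join(missing))
```

Running the check first gives a clearer message than a failed config validation when the model archive was extracted somewhere else.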
C++ API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
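The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal Python sketch of the same arithmetic follows. `real_time_factor` is a hypothetical helper, not part of sherpa-onnx, and the numbers are illustrative rather than measured.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed processing time divided by the audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative values only: 0.5 s of compute for 44100 samples at 22050 Hz (2 s of audio).
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```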
Dart API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
tokens = "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx");
vits.setTokens("vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-meski_wg_glos-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-meski_wg_glos-medium/pl_PL-meski_wg_glos-medium.onnx",
Tokens: "vits-piper-pl_PL-meski_wg_glos-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-meski_wg_glos-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pl_PL-zenski_wg_glos-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/k2-fsa/sherpa-onnx/issues/2402
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
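If you launch the compiled binary from a script rather than an interactive shell, the same library-path setup can be done programmatically. The following Python sketch picks the variable by platform and prepends the library directory. `library_path_variable` and `prepend_library_path` are hypothetical helpers written for illustration; the install path is the one assumed in the build steps above.

```python
import sys
from typing import Dict, Mapping


def library_path_variable(platform: str = sys.platform) -> str:
    """Name of the dynamic-loader search-path variable for the platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def prepend_library_path(
    lib_dir: str, env: Mapping[str, str], platform: str = sys.platform
) -> Dict[str, str]:
    """Return a copy of env with lib_dir placed first in the loader search path."""
    name = library_path_variable(platform)
    new_env = dict(env)
    old = new_env.get(name, "")
    # Avoid a trailing ':' when the variable was previously unset or empty.
    new_env[name] = f"{lib_dir}:{old}" if old else lib_dir
    return new_env


env = prepend_library_path("/tmp/sherpa-onnx/shared/lib", {"LD_LIBRARY_PATH": "/usr/local/lib"}, "linux")
print(env["LD_LIBRARY_PATH"])  # prints "/tmp/sherpa-onnx/shared/lib:/usr/local/lib"
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting /tmp/test-piper.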
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
data_dir="vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
tokens="vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nieważne, za kogo walczysz, i tak popełnisz błąd",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
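The example above raises an error when `config.validate()` fails, which typically means a path is wrong. A small pre-check can report which file is missing before the config is built. This is a sketch only: `missing_model_files` is a hypothetical helper, and the three entries it checks are simply the paths used in the example above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # directory with espeak-ng data
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files(
    "vits-piper-pl_PL-zenski_wg_glos-medium",
    "pl_PL-zenski_wg_glos-medium.onnx",
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```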
C++ API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
config.model.vits.tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx".into()),
tokens: Some("vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
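The progress callbacks in the examples on this page share one contract: they receive the samples generated so far in the current step plus a progress fraction, and their return value decides whether generation continues (`true` in Rust, 1 to continue and 0 to stop in C and C++). The sketch below illustrates that contract in plain Python by stopping once a time budget of audio has been collected. It does not call sherpa-onnx; `make_progress_callback` is a hypothetical helper, and the 1/0 return convention is copied from the C++ comments above.

```python
from typing import Callable, List, Sequence


def make_progress_callback(
    sample_rate: int,
    max_seconds: float,
    chunks: List[Sequence[float]],
) -> Callable[[Sequence[float], float], int]:
    """Build a callback that stores each chunk and stops after max_seconds of audio."""

    def callback(samples: Sequence[float], progress: float) -> int:
        chunks.append(samples)
        total_samples = sum(len(c) for c in chunks)
        print(f"Progress: {progress * 100:.1f}%")
        # 1 means continue, 0 means stop
        return 1 if total_samples / sample_rate < max_seconds else 0

    return callback


collected: List[Sequence[float]] = []
cb = make_progress_callback(sample_rate=22050, max_seconds=1.0, chunks=collected)
print(cb([0.0] * 11025, 0.5))  # 0.5 s collected so far, returns 1
print(cb([0.0] * 11025, 1.0))  # 1.0 s collected, budget reached, returns 0
```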
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nieważne, za kogo walczysz, i tak popełnisz błąd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx',
tokens: 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt',
dataDir: 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nieważne, za kogo walczysz, i tak popełnisz błąd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
lexicon: "",
tokens: "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
dataDir: "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nieważne, za kogo walczysz, i tak popełnisz błąd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
tokens = "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
dataDir = "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nieważne, za kogo walczysz, i tak popełnisz błąd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx");
vits.setTokens("vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt");
vits.setDataDir("vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nieważne, za kogo walczysz, i tak popełnisz błąd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nieważne, za kogo walczysz, i tak popełnisz błąd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pl_PL-zenski_wg_glos-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pl_PL-zenski_wg_glos-medium/pl_PL-zenski_wg_glos-medium.onnx",
Tokens: "vits-piper-pl_PL-zenski_wg_glos-medium/tokens.txt",
DataDir: "vits-piper-pl_PL-zenski_wg_glos-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nieważne, za kogo walczysz, i tak popełnisz błąd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nieważne, za kogo walczysz, i tak popełnisz błąd
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-pl
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Polish (pl).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
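Since this model accepts speaker IDs 0 to 9 and one of 31 language codes, it can help to validate both before calling the engine. The sketch below does that and builds the JSON string that the C and C++ examples on this page pass in the `extra` field. `build_generation_args` is a hypothetical helper, not a sherpa-onnx function; the language list and speaker count are taken from the text above.

```python
import json
from typing import Dict, Union

# Language codes and speaker count as listed in "Info about this model".
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10


def build_generation_args(sid: int, lang: str) -> Dict[str, Union[int, str]]:
    """Validate sid and lang, then return them in the shape the examples use."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return {"sid": sid, "extra": json.dumps({"lang": lang})}


print(build_generation_args(0, "pl"))
# prints {'sid': 0, 'extra': '{"lang": "pl"}'}
```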
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "pl"
audio = tts.generate("Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"pl\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "pl"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "pl"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'pl'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
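The Node.js example above reports a real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. The following Python sketch shows the same arithmetic in isolation. The function name and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not measurements of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 2 seconds of audio at 44100 Hz generated in 0.5 seconds.
rtf = real_time_factor(0.5, 88200, 44100)
print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.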
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'pl'},
);
final audio = tts.generateWithConfig(text: 'Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "pl"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"pl\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
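In the examples that pass extra as a string, the value is JSON text with hand-escaped quotes. If you build that string in a script, a JSON serializer avoids escaping mistakes. The following Python sketch shows the idea. The helper name `make_extra` is hypothetical, and only the lang key, which is the one used in the examples above, is assumed.

```python
import json


def make_extra(lang):
    """Serialize the per-request options to the JSON text used for `extra`."""
    return json.dumps({"lang": lang})


extra = make_extra("pl")
print(extra)  # {"lang": "pl"}
```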
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "pl"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
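The callback in the example above always returns 1, so generation never stops early. The following Python sketch illustrates the continue/stop contract with a callback that returns 0 once enough samples have arrived. Both `StopAfter` and `fake_generate` are hypothetical: `fake_generate` is a stand-in that only mimics how an engine could consult the callback, and it is not the sherpa-onnx API.

```python
class StopAfter:
    """Callback that asks the generator to stop once max_samples have arrived."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples):
        self.received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if self.received < self.max_samples else 0


def fake_generate(chunks, callback):
    """Hypothetical stand-in for a TTS engine: delivers chunks until told to stop."""
    delivered = []
    for chunk in chunks:
        delivered.extend(chunk)
        if callback(chunk) == 0:
            break
    return delivered
```

Note that the chunk that triggers the stop has already been delivered, so the caller may receive slightly more audio than the threshold.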
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"pl\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "pl"}';
Audio := Tts.GenerateWithConfig('Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "pl"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio clips for each speaker are listed below:
Speaker 0
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 1
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 2
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 3
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 4
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 5
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 6
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 7
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 8
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Speaker 9
0
Witaj świecie.
1
Jak się dziś masz?
2
Niebo jest niebieskie, a wiatr jest łagodny.
3
Uczenie maszynowe pomaga komputerom uczyć się z danych.
4
Synteza mowy zamienia tekst w wyraźny dźwięk.
5
Uczniowie przeczytali krótką historię w bibliotece.
6
Pociąg spóźnił się z powodu konserwacji torów.
7
Małe modele działają szybko na lokalnych urządzeniach.
8
Asystent głosowy pomaga w codziennych zadaniach.
9
Stabilne czytanie jest ważne dla krótkich i długich zdań.
Portuguese
This section lists text to speech models for Portuguese.
vits-piper-pt_BR-cadu-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/cadu/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_BR-cadu-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
data_dir="vits-piper-pt_BR-cadu-medium/espeak-ng-data",
tokens="vits-piper-pt_BR-cadu-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
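If you do not want to depend on soundfile, the generated samples can also be saved with the Python standard library. The following is a minimal sketch, assuming the samples are mono floating-point values in the range [-1, 1]. The helper name `write_wave_pcm16` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range.
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

With the audio object from the example above, the call would be write_wave_pcm16("test.wav", audio.samples, audio.sample_rate).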
C++ API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx".into()),
tokens: Some("vits-piper-pt_BR-cadu-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-cadu-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-cadu-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx',
tokens: 'vits-piper-pt_BR-cadu-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-cadu-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx',
tokens: 'vits-piper-pt_BR-cadu-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-cadu-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-cadu-medium/tokens.txt",
dataDir: "vits-piper-pt_BR-cadu-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
tokens = "vits-piper-pt_BR-cadu-medium/tokens.txt",
dataDir = "vits-piper-pt_BR-cadu-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx");
vits.setTokens("vits-piper-pt_BR-cadu-medium/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-cadu-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-cadu-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-cadu-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-cadu-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-cadu-medium/pt_BR-cadu-medium.onnx",
Tokens: "vits-piper-pt_BR-cadu-medium/tokens.txt",
DataDir: "vits-piper-pt_BR-cadu-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
audio samples for each speaker are listed below:
Speaker 0
vits-piper-pt_BR-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-dii-high.tar.bz2
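If you prefer to script the download, the following is a minimal Python sketch that fetches and unpacks the archive using only the standard library. The helper names `model_url` and `download_and_extract` are hypothetical. The sketch also assumes that other archives follow the same URL pattern as the address above, and that the archive unpacks into a directory named after the model, as the paths in the examples below suggest.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumption: release archives share the URL pattern of the address listed above.
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models"


def model_url(model_name: str) -> str:
    """Build the download address for a model archive."""
    return f"{RELEASE_BASE}/{model_name}.tar.bz2"


def download_and_extract(model_name: str, dest: str = ".") -> Path:
    """Download the archive into dest, unpack it, and delete the archive."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{model_name}.tar.bz2"
    urllib.request.urlretrieve(model_url(model_name), str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_dir)
    archive.unlink()  # same effect as removing the .tar.bz2 by hand
    # Assumption: the archive contains a top-level directory named model_name.
    return dest_dir / model_name
```

Calling `download_and_extract("vits-piper-pt_BR-dii-high")` from the directory where you run the examples should leave a `vits-piper-pt_BR-dii-high/` directory next to your script.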
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.vits.model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }
  const char *text = "Marinha sem vento, não chega a porto";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");
  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_BR-dii-high.tar.bz2
You can use the following code to play with vits-piper-pt_BR-dii-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
            data_dir="vits-piper-pt_BR-dii-high/espeak-ng-data",
            tokens="vits-piper-pt_BR-dii-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-pt_BR-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx".into()),
tokens: Some("vits-piper-pt_BR-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
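The Rust example above prints the model's sample rate and writes ./test.wav. To confirm that the saved file matches the sample rate listed for this model, you can inspect the WAV header. The following is a minimal sketch using Python's standard `wave` module; `wav_info` is a hypothetical helper name, and the sketch assumes the file is uncompressed PCM, which is what `wave` can read.

```python
import wave


def wav_info(path: str) -> dict:
    """Read channel count, sample rate and duration from a PCM WAV header."""
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "duration_seconds": num_frames / sample_rate,
        }
```

For this model, `wav_info("./test.wav")["sample_rate"]` would be expected to match the 22050 Hz listed in the table for this model.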
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx',
tokens: 'vits-piper-pt_BR-dii-high/tokens.txt',
dataDir: 'vits-piper-pt_BR-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pt_BR-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx',
tokens: 'vits-piper-pt_BR-dii-high/tokens.txt',
dataDir: 'vits-piper-pt_BR-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-dii-high/tokens.txt",
dataDir: "vits-piper-pt_BR-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
tokens = "vits-piper-pt_BR-dii-high/tokens.txt",
dataDir = "vits-piper-pt_BR-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
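The callback in the example above always returns 1, so generation never stops early. To illustrate the contract (1 to continue, 0 to stop), here is a small Python sketch of a callback that asks for a stop once a sample budget is reached. Both `make_budget_callback` and `run_with_callback` are hypothetical names. The driver loop only simulates how a generator might consume the return value and is not the sherpa-onnx implementation.

```python
from typing import Callable, Iterable, List, Sequence


def make_budget_callback(max_samples: int) -> Callable[[Sequence[float]], int]:
    """Build a callback that requests a stop once max_samples have been seen."""
    seen = 0

    def callback(samples: Sequence[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1: continue, 0: stop

    return callback


def run_with_callback(chunks: Iterable[Sequence[float]],
                      callback: Callable[[Sequence[float]], int]) -> List[float]:
    """Hypothetical driver: pass each chunk to the callback, stop when it returns 0."""
    collected: List[float] = []
    for chunk in chunks:
        collected.extend(chunk)
        if callback(chunk) == 0:
            break
    return collected
```

With five chunks of 100 samples and a budget of 250, the simulated driver stops after the third chunk, so 300 samples are collected.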
Java API
You can use the following code to play with vits-piper-pt_BR-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx");
vits.setTokens("vits-piper-pt_BR-dii-high/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-dii-high/pt_BR-dii-high.onnx",
Tokens: "vits-piper-pt_BR-dii-high/tokens.txt",
DataDir: "vits-piper-pt_BR-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
audio samples for each speaker are listed below:
Speaker 0
vits-piper-pt_BR-edresson-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/edresson/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-edresson-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.vits.model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
  config.model.vits.tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
  config.model.vits.data_dir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Creation fails, for example, when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }
  const char *text = "Marinha sem vento, não chega a porto";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");
  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_BR-edresson-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
            data_dir="vits-piper-pt_BR-edresson-low/espeak-ng-data",
            tokens="vits-piper-pt_BR-edresson-low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
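The examples for this model all point at the same three paths inside the model directory: the .onnx file, tokens.txt, and espeak-ng-data. If the config check fails, a missing path is one plausible cause. The sketch below lists whichever of the three paths is absent before the config is built; `missing_model_files` is a hypothetical helper and is not part of sherpa-onnx.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that do not exist on disk."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # data directory used by the examples
    ]
    return [str(p) for p in required if not p.exists()]
```

For example, `missing_model_files("vits-piper-pt_BR-edresson-low", "pt_BR-edresson-low.onnx")` returns an empty list when the archive has been extracted into the current directory.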
C++ API
You can use the following code to play with vits-piper-pt_BR-edresson-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx".into()),
tokens: Some("vits-piper-pt_BR-edresson-low/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-edresson-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-edresson-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx',
tokens: 'vits-piper-pt_BR-edresson-low/tokens.txt',
dataDir: 'vits-piper-pt_BR-edresson-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
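The script above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The Python sketch below restates that arithmetic on its own. The helper name `real_time_factor` and the numbers in the usage lines are illustrative assumptions, not part of sherpa-onnx and not measurements of this model.

```python
def real_time_factor(num_samples: int, sample_rate: int,
                     elapsed_seconds: float) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz is 2 s of audio,
# assumed here to have taken 0.5 s to generate.
rtf = real_time_factor(num_samples=44100, sample_rate=22050,
                       elapsed_seconds=0.5)
print(f"RTF = {rtf:.3f}")
```

The same formula applies to every language binding on this page, since each one exposes the sample array and the sample rate of the generated audio.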
Dart API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx',
tokens: 'vits-piper-pt_BR-edresson-low/tokens.txt',
dataDir: 'vits-piper-pt_BR-edresson-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-edresson-low/tokens.txt",
dataDir: "vits-piper-pt_BR-edresson-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-edresson-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-edresson-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-edresson-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
tokens = "vits-piper-pt_BR-edresson-low/tokens.txt",
dataDir = "vits-piper-pt_BR-edresson-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
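The callback in the example above returns 1 to continue and 0 to stop. The Python sketch below illustrates that contract with a stand-in generator. It does not call sherpa-onnx: `generate_in_chunks`, `stop_after` and the fake sample chunks are hypothetical names invented for illustration, and whether the real library keeps the chunk that triggered the stop is an assumption made here.

```python
from typing import Callable, List


def generate_in_chunks(chunks: List[List[float]],
                       callback: Callable[[List[float]], int]) -> List[float]:
    """Stand-in synthesizer: pass each chunk to `callback`, stop when it returns 0."""
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)  # assumption: the chunk is kept even if we stop
        if callback(chunk) == 0:
            break
    return produced


def stop_after(limit: int) -> Callable[[List[float]], int]:
    """Build a callback that returns 1 until it has seen `limit` chunks."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < limit else 0

    return callback


# Three fake chunks of samples; the callback asks to stop after the second.
fake_chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
audio = generate_in_chunks(fake_chunks, stop_after(2))
print(len(audio), "samples kept")
```

A callback of this shape is a natural place to stream samples to an audio device or to cancel a long synthesis when the user navigates away.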
Java API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx");
vits.setTokens("vits-piper-pt_BR-edresson-low/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-edresson-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-edresson-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-edresson-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-edresson-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-edresson-low/pt_BR-edresson-low.onnx",
Tokens: "vits-piper-pt_BR-edresson-low/tokens.txt",
DataDir: "vits-piper-pt_BR-edresson-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pt_BR-faber-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/faber/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-faber-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command (the source file is listed before the libraries so that linkers using --as-needed resolve the symbols):
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
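The environment variable that tells the dynamic loader where to find the shared libraries differs between Linux and macOS. As a minimal sketch, the Python helper below picks the variable name for the current platform and prepends the install directory used on this page. The function name `runtime_library_env` is a hypothetical choice for illustration; only `sys.platform` and `os.environ` come from the standard library.

```python
import os
import sys
from typing import Mapping, Optional, Tuple


def runtime_library_env(lib_dir: str,
                        platform: str = sys.platform,
                        environ: Optional[Mapping[str, str]] = None
                        ) -> Tuple[str, str]:
    """Return (variable name, new value) with lib_dir prepended (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # macOS uses DYLD_LIBRARY_PATH; Linux (and most other Unix systems)
    # use LD_LIBRARY_PATH.
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    current = env.get(name, "")
    value = f"{lib_dir}:{current}" if current else lib_dir
    return name, value


name, value = runtime_library_env("/tmp/sherpa-onnx/shared/lib")
print(f"export {name}={value}")
```

Unlike the plain export commands, the helper avoids leaving a trailing colon when the variable was previously unset.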
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_BR-faber-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
data_dir="vits-piper-pt_BR-faber-medium/espeak-ng-data",
tokens="vits-piper-pt_BR-faber-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
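The example above expects the model file, tokens.txt and the espeak-ng-data directory to sit inside the extracted model directory. As a minimal sketch, the check below lists whichever of those paths are missing before the config is built. The helper name `missing_model_files` is hypothetical; the file names are the ones used in the example above.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the required paths that do not exist (hypothetical helper)."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # phonemizer data directory
    ]
    return [str(p) for p in required if not p.exists()]


missing = missing_model_files("vits-piper-pt_BR-faber-medium",
                              "pt_BR-faber-medium.onnx")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

Running it from the directory where the archive was extracted gives a clearer message than a failed config validation.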
C++ API
You can use the following code to play with vits-piper-pt_BR-faber-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command (the source file is listed before the libraries so that linkers using --as-needed resolve the symbols):
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx".into()),
tokens: Some("vits-piper-pt_BR-faber-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-faber-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-faber-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx',
tokens: 'vits-piper-pt_BR-faber-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-faber-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx',
tokens: 'vits-piper-pt_BR-faber-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-faber-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-faber-medium/tokens.txt",
dataDir: "vits-piper-pt_BR-faber-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-faber-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-faber-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-faber-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
tokens = "vits-piper-pt_BR-faber-medium/tokens.txt",
dataDir = "vits-piper-pt_BR-faber-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx");
vits.setTokens("vits-piper-pt_BR-faber-medium/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-faber-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-faber-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-faber-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-faber-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-faber-medium/pt_BR-faber-medium.onnx",
Tokens: "vits-piper-pt_BR-faber-medium/tokens.txt",
DataDir: "vits-piper-pt_BR-faber-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pt_BR-jeff-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_BR/jeff/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command (the source file is listed before the libraries so that linkers using --as-needed resolve the symbols):
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
data_dir="vits-piper-pt_BR-jeff-medium/espeak-ng-data",
tokens="vits-piper-pt_BR-jeff-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
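The example above relies on the third-party soundfile package to save the result. Where that package is unavailable, the standard-library `wave` module can write a mono 16-bit WAV file instead. The sketch below assumes the samples are floats in the range [-1, 1]; the helper name `write_wav_16bit` and the synthetic tone standing in for `audio.samples` are illustrative assumptions.

```python
import math
import struct
import wave
from typing import Sequence


def write_wav_16bit(path: str, samples: Sequence[float],
                    sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as 16-bit PCM (hypothetical helper)."""
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)       # mono
        f.setsampwidth(2)       # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Stand-in for audio.samples: 0.1 s of a 440 Hz tone at 22050 Hz.
sample_rate = 22050
tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(2205)]
write_wav_16bit("tone.wav", tone, sample_rate)
```

With a real result, the call would take `audio.samples` and `audio.sample_rate` in place of the synthetic tone.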
C++ API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
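The callback contract used above (return 1 to continue, return 0 to stop) can be illustrated without the library. The following is a minimal Python sketch: `generate_in_chunks` is a hypothetical stand-in for a TTS engine, not a sherpa-onnx function, and it assumes the chunk that triggers the stop is still kept in the output.

```python
def generate_in_chunks(chunks, callback):
    """Toy stand-in for a TTS engine (hypothetical, not part of sherpa-onnx).

    Hands each chunk of samples to `callback` together with the progress in
    (0, 1]. Generation stops early when the callback returns 0.
    """
    total = len(chunks)
    produced = []
    for index, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        progress = index / total
        if callback(chunk, progress) == 0:
            break  # the callback asked us to stop
    return produced


def stop_after_half(samples, progress):
    """Report progress and request a stop once half of the chunks are done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


# Four chunks of one sample each; the callback stops after the second chunk.
audio = generate_in_chunks([[0.1], [0.2], [0.3], [0.4]], stop_after_half)
```

Returning 0 is useful when the user cancels playback and the remaining audio is no longer needed.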
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx".into()),
tokens: Some("vits-piper-pt_BR-jeff-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-jeff-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-jeff-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx',
tokens: 'vits-piper-pt_BR-jeff-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-jeff-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx',
tokens: 'vits-piper-pt_BR-jeff-medium/tokens.txt',
dataDir: 'vits-piper-pt_BR-jeff-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-jeff-medium/tokens.txt",
dataDir: "vits-piper-pt_BR-jeff-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
tokens = "vits-piper-pt_BR-jeff-medium/tokens.txt",
dataDir = "vits-piper-pt_BR-jeff-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx");
vits.setTokens("vits-piper-pt_BR-jeff-medium/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-jeff-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-jeff-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-jeff-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-jeff-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-jeff-medium/pt_BR-jeff-medium.onnx",
Tokens: "vits-piper-pt_BR-jeff-medium/tokens.txt",
DataDir: "vits-piper-pt_BR-jeff-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audio for each speaker is listed below:
Speaker 0
vits-piper-pt_BR-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_BR-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
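If you launch the compiled binary from a script rather than a shell, the same library search path can be set for the child process only. The following is a minimal Python sketch; `env_with_library_path` is a hypothetical helper introduced here for illustration, and it assumes a POSIX system where `:` separates path entries.

```python
import os
import sys


def env_with_library_path(lib_dir, platform=sys.platform, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic-library search path (hypothetical helper for illustration).

    macOS uses DYLD_LIBRARY_PATH; Linux and other platforms handled here
    use LD_LIBRARY_PATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Prepend so that the freshly built libraries take precedence.
    env[var] = lib_dir + (":" + old if old else "")
    return env
```

For example, `subprocess.run(["/tmp/test-piper"], env=env_with_library_path("/tmp/sherpa-onnx/shared/lib"))` would run the example without changing the environment of the calling shell.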
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_BR-miro-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
            data_dir="vits-piper-pt_BR-miro-high/espeak-ng-data",
            tokens="vits-piper-pt_BR-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
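If soundfile is not available, a WAV file can also be written with Python's standard library. The sketch below assumes the generated samples are mono floats in the range [-1, 1], which is an assumption rather than something stated above. `write_wav_16bit` is a hypothetical helper, not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(filename, samples, sample_rate):
    """Write mono float samples as 16-bit PCM WAV (hypothetical helper).

    Samples are assumed to lie in [-1, 1]; values outside that range are
    clipped so the integer conversion cannot overflow.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would be `write_wav_16bit("test.wav", audio.samples, audio.sample_rate)`.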
C++ API
You can use the following code to play with vits-piper-pt_BR-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
config.model.vits.tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_BR-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx".into()),
tokens: Some("vits-piper-pt_BR-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-pt_BR-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_BR-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx',
tokens: 'vits-piper-pt_BR-miro-high/tokens.txt',
dataDir: 'vits-piper-pt_BR-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
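The real-time factor (RTF) printed by the example above is the time spent generating divided by the duration of the generated audio, so values below 1 mean generation is faster than playback. A minimal Python sketch of the same arithmetic follows; `real_time_factor` is a hypothetical helper name, and the numbers are made up for illustration using this model's 22050 Hz sample rate.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers: 44100 samples at 22050 Hz is 2 seconds of audio.
# Generating it in 0.5 seconds gives an RTF of 0.25.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```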
Dart API
You can use the following code to play with vits-piper-pt_BR-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx',
tokens: 'vits-piper-pt_BR-miro-high/tokens.txt',
dataDir: 'vits-piper-pt_BR-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_BR-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-pt_BR-miro-high/tokens.txt",
dataDir: "vits-piper-pt_BR-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_BR-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_BR-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_BR-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_BR-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
tokens = "vits-piper-pt_BR-miro-high/tokens.txt",
dataDir = "vits-piper-pt_BR-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_BR-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx");
vits.setTokens("vits-piper-pt_BR-miro-high/tokens.txt");
vits.setDataDir("vits-piper-pt_BR-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_BR-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_BR-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_BR-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_BR-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_BR-miro-high/pt_BR-miro-high.onnx",
Tokens: "vits-piper-pt_BR-miro-high/tokens.txt",
DataDir: "vits-piper-pt_BR-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audio for each speaker is listed below:
Speaker 0
vits-piper-pt_PT-dii-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-dii-high.tar.bz2
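The archive can also be fetched and unpacked from Python using only the standard library. The following is a minimal sketch; `extract_archive` and `download_and_extract` are hypothetical helper names, and the only value taken from this page is the download URL.

```python
import os
import tarfile
import urllib.request

# Download address given above.
MODEL_URL = ("https://github.com/k2-fsa/sherpa-onnx/releases/download/"
             "tts-models/vits-piper-pt_PT-dii-high.tar.bz2")


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.bz2 archive and return its sorted top-level entries."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return sorted({name.split("/")[0] for name in names})


def download_and_extract(url=MODEL_URL, dest_dir="."):
    """Download the archive, unpack it into dest_dir, then delete the archive."""
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    top_level = extract_archive(archive_path, dest_dir)
    os.remove(archive_path)
    return top_level
```

Calling `download_and_extract()` in the working directory should leave the model files in place, assuming the archive unpacks to the vits-piper-pt_PT-dii-high directory that the code examples for this model refer to.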
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_PT-dii-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-pt_PT-dii-high.tar.bz2
You can use the following code to play with vits-piper-pt_PT-dii-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
data_dir="vits-piper-pt_PT-dii-high/espeak-ng-data",
tokens="vits-piper-pt_PT-dii-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
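# Write a WAV file, matching the other examples; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)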
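A quick way to sanity-check a generated file is to read its header with Python's standard `wave` module. The sketch below assumes a PCM WAV file (such as the `test.wav` written by the C example); `wav_info` is a hypothetical helper, not a sherpa-onnx function.

```python
import wave


def wav_info(path: str) -> dict:
    """Read the header of a PCM WAV file and report its basic properties."""
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "num_frames": num_frames,
            # Duration follows directly from frame count and sample rate.
            "duration_seconds": num_frames / sample_rate,
        }
```

Calling `wav_info("test.wav")` after running one of the examples should report the sample rate that the model was exported with.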
C++ API
You can use the following code to play with vits-piper-pt_PT-dii-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_PT-dii-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx".into()),
tokens: Some("vits-piper-pt_PT-dii-high/tokens.txt".into()),
data_dir: Some("vits-piper-pt_PT-dii-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_PT-dii-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx',
tokens: 'vits-piper-pt_PT-dii-high/tokens.txt',
dataDir: 'vits-piper-pt_PT-dii-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
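The real-time factor (RTF) printed by the example above is the elapsed generation time divided by the duration of the generated audio. As a minimal sketch of the same arithmetic in Python (the function name `real_time_factor` is chosen here for illustration):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration.

    Values below 1.0 mean the audio was generated faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio produced
    return elapsed_seconds / duration
```

For instance, producing 2 seconds of audio in 0.5 seconds gives an RTF of 0.25.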
Dart API
You can use the following code to play with vits-piper-pt_PT-dii-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx',
tokens: 'vits-piper-pt_PT-dii-high/tokens.txt',
dataDir: 'vits-piper-pt_PT-dii-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_PT-dii-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
lexicon: "",
tokens: "vits-piper-pt_PT-dii-high/tokens.txt",
dataDir: "vits-piper-pt_PT-dii-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_PT-dii-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-dii-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-dii-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_PT-dii-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
tokens = "vits-piper-pt_PT-dii-high/tokens.txt",
dataDir = "vits-piper-pt_PT-dii-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
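The integer-returning callbacks in these examples follow one convention: return 1 to continue generating and 0 to stop. The sketch below illustrates that convention without the library. `run_chunks` is a hypothetical stand-in for a generation loop and is not part of sherpa-onnx; the `(samples, progress)` callback shape is modelled on the C++ example.

```python
from typing import Callable, List, Sequence


def run_chunks(chunks: Sequence[List[float]],
               callback: Callable[[List[float], float], int]) -> List[float]:
    """Hypothetical generation loop; NOT part of sherpa-onnx.

    Calls callback(chunk, progress) after each chunk and stops early
    as soon as the callback returns 0.
    """
    produced: List[float] = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        produced.extend(chunk)
        if callback(chunk, i / total) == 0:
            break
    return produced


def stop_after_half(samples: List[float], progress: float) -> int:
    # 1 means continue, 0 means stop
    return 1 if progress < 0.5 else 0
```

A callback that always returns 1 lets the loop run to completion, while `stop_after_half` ends it once half of the chunks have been delivered.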
Java API
You can use the following code to play with vits-piper-pt_PT-dii-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx");
vits.setTokens("vits-piper-pt_PT-dii-high/tokens.txt");
vits.setDataDir("vits-piper-pt_PT-dii-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_PT-dii-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_PT-dii-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_PT-dii-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_PT-dii-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_PT-dii-high/pt_PT-dii-high.onnx",
Tokens: "vits-piper-pt_PT-dii-high/tokens.txt",
DataDir: "vits-piper-pt_PT-dii-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
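Every example above ends by writing floating-point samples to a wave file. The following is a rough sketch of what such a step involves, using only the Python standard library. It assumes mono samples in the range [-1, 1] and 16-bit PCM output; `write_wave_pcm16` is a hypothetical helper and does not reproduce the library's own implementation.

```python
import struct
import wave
from typing import Sequence


def write_wave_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # clamp out-of-range values before scaling
        frames += struct.pack("<h", int(clipped * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```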
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pt_PT-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_PT-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_PT-miro-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
data_dir="vits-piper-pt_PT-miro-high/espeak-ng-data",
tokens="vits-piper-pt_PT-miro-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
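# Write a WAV file, matching the other examples; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)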
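Before constructing the config, it can help to confirm that the extracted model directory contains the files the example refers to. The sketch below assumes the layout implied by the paths above (an .onnx file, tokens.txt, and an espeak-ng-data directory); `missing_model_files` is a hypothetical helper, not a sherpa-onnx function.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # acoustic model
        root / "tokens.txt",      # token table
        root / "espeak-ng-data",  # phonemizer data directory
    ]
    return [str(p) for p in expected if not p.exists()]
```

An empty list from `missing_model_files("vits-piper-pt_PT-miro-high", "pt_PT-miro-high.onnx")` means all three paths are present.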
C++ API
You can use the following code to play with vits-piper-pt_PT-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_PT-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx".into()),
tokens: Some("vits-piper-pt_PT-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-pt_PT-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_PT-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx',
tokens: 'vits-piper-pt_PT-miro-high/tokens.txt',
dataDir: 'vits-piper-pt_PT-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-pt_PT-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx',
tokens: 'vits-piper-pt_PT-miro-high/tokens.txt',
dataDir: 'vits-piper-pt_PT-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_PT-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-pt_PT-miro-high/tokens.txt",
dataDir: "vits-piper-pt_PT-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_PT-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_PT-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
tokens = "vits-piper-pt_PT-miro-high/tokens.txt",
dataDir = "vits-piper-pt_PT-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_PT-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx");
vits.setTokens("vits-piper-pt_PT-miro-high/tokens.txt");
vits.setDataDir("vits-piper-pt_PT-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_PT-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_PT-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_PT-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_PT-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_PT-miro-high/pt_PT-miro-high.onnx",
Tokens: "vits-piper-pt_PT-miro-high/tokens.txt",
DataDir: "vits-piper-pt_PT-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audios for different speakers are listed below:
Speaker 0
vits-piper-pt_PT-tugao-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_PT/tugão/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Marinha sem vento, não chega a porto";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-pt_PT-tugao-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
data_dir="vits-piper-pt_PT-tugao-medium/espeak-ng-data",
tokens="vits-piper-pt_PT-tugao-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Marinha sem vento, não chega a porto",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
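To confirm that a generated file matches the 22050 Hz sample rate listed for this model, you can inspect the WAV header. The following is a minimal sketch that uses only Python's standard `wave` module. The helper name `wav_info` is hypothetical and not part of sherpa-onnx, and the sketch assumes the file is an uncompressed PCM WAV such as the `test.wav` written by the other examples in this section.

```python
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    with wave.open(path, "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration


if __name__ == "__main__":
    import sys

    # Usage: python3 wav_info.py ./test.wav
    if len(sys.argv) == 2:
        rate, channels, seconds = wav_info(sys.argv[1])
        print(f"{rate} Hz, {channels} channel(s), {seconds:.3f} s")
```

For this model, the first value reported should be 22050.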
C++ API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
config.model.vits.tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Marinha sem vento, não chega a porto";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx".into()),
tokens: Some("vits-piper-pt_PT-tugao-medium/tokens.txt".into()),
data_dir: Some("vits-piper-pt_PT-tugao-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Marinha sem vento, não chega a porto";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-pt_PT-tugao-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx',
tokens: 'vits-piper-pt_PT-tugao-medium/tokens.txt',
dataDir: 'vits-piper-pt_PT-tugao-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Marinha sem vento, não chega a porto';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
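The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic can be sketched in Python as follows. The function name and the numbers are illustrative assumptions, not measurements from this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

An RTF below 1 means the audio was generated faster than it takes to play back.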
Dart API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx',
tokens: 'vits-piper-pt_PT-tugao-medium/tokens.txt',
dataDir: 'vits-piper-pt_PT-tugao-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Marinha sem vento, não chega a porto', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
lexicon: "",
tokens: "vits-piper-pt_PT-tugao-medium/tokens.txt",
dataDir: "vits-piper-pt_PT-tugao-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Marinha sem vento, não chega a porto"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
tokens = "vits-piper-pt_PT-tugao-medium/tokens.txt",
dataDir = "vits-piper-pt_PT-tugao-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Marinha sem vento, não chega a porto",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx");
vits.setTokens("vits-piper-pt_PT-tugao-medium/tokens.txt");
vits.setDataDir("vits-piper-pt_PT-tugao-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Marinha sem vento, não chega a porto";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-pt_PT-tugao-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-pt_PT-tugao-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Marinha sem vento, não chega a porto', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-pt_PT-tugao-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-pt_PT-tugao-medium/pt_PT-tugao-medium.onnx",
Tokens: "vits-piper-pt_PT-tugao-medium/tokens.txt",
DataDir: "vits-piper-pt_PT-tugao-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Marinha sem vento, não chega a porto"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Marinha sem vento, não chega a porto
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-pt
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Portuguese (pt).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
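Several examples on this page (for instance the C and C# ones) select the language by passing a JSON string such as `{"lang": "pt"}` in the `extra` field of the generation config. The sketch below shows one way to build that string while checking the language code and speaker ID against the values listed above. The helper `make_extra` is hypothetical and not part of sherpa-onnx; it only validates its inputs and serializes the language.

```python
import json

# Language codes and speaker count copied from the model description above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10


def make_extra(lang, sid=0):
    """Validate lang and sid, then return the JSON string for the `extra` field.

    Hypothetical helper for illustration; not part of sherpa-onnx.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return json.dumps({"lang": lang})


print(make_extra("pt"))  # prints {"lang": "pt"}
```

Checking the inputs up front gives a clear error message before any model files are loaded.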
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "pt"
audio = tts.generate("Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"pt\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "pt"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "pt"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'pt'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'pt'},
);
final audio = tts.generateWithConfig(text: 'Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "pt"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"pt\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
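Several examples for this model pass the language as a JSON string in the extra field, for example {"lang": "pt"}. The Python sketch below builds that string with the standard json module rather than escaping quotes by hand. The helper name make_extra is hypothetical and not part of sherpa-onnx; only the {"lang": ...} shape is taken from the examples in this section.

```python
import json


def make_extra(lang: str) -> str:
    """Return JSON text for the `extra` field (hypothetical helper)."""
    # json.dumps handles quoting and escaping, so no manual backslashes are needed.
    return json.dumps({"lang": lang}, ensure_ascii=False)


if __name__ == "__main__":
    # Same language code as the Portuguese examples in this section.
    print(make_extra("pt"))
```

The resulting string can be assigned wherever an example expects the extra value as JSON text.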
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "pt"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"pt\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "pt"}';
Audio := Tts.GenerateWithConfig('Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "pt"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
| No. | Text |
|---|---|
| 0 | Olá mundo. |
| 1 | Como você está hoje? |
| 2 | O céu é azul. |
| 3 | Eu amo aprendizado de máquina. |
| 4 | Python é incrível. |
| 5 | Bom dia a todos. |
| 6 | A inteligência artificial está crescendo. |
| 7 | A síntese de voz é fascinante. |
| 8 | As redes neurais são poderosas. |
| 9 | Texto para voz converte texto em áudio. |
| 10 | A rápida raposa marrom salta sobre o cachorro preguiçoso. |
| 11 | O aprendizado de máquina permite que computadores aprendam. |
| 12 | O processamento de linguagem natural ajuda máquinas a entender. |
| 13 | O aprendizado profundo revolucionou a inteligência artificial. |
| 14 | A tecnologia de síntese de voz avançou significativamente. |
| 15 | A clonagem de voz neural pode replicar estilos de fala. |
| 16 | A normalização de texto é importante para pronúncia. |
| 17 | Assistentes de voz nos ajudam a interagir com tecnologia. |
| 18 | Sistemas TTS modernos usam aprendizado profundo para áudio. |
| 19 | A interação humano computador tornou-se mais intuitiva. |
Speaker 1 to Speaker 9
Each of these speakers is sampled with the same 20 sentences (0 to 19) listed under Speaker 0.
Romanian
This section lists text to speech models for Romanian.
vits-piper-ro_RO-mihai-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ro/ro_RO/mihai/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
config.model.vits.tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Un foc fără lemne se stinge, o lume fără poveste moare.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
            data_dir="vits-piper-ro_RO-mihai-medium/espeak-ng-data",
            tokens="vits-piper-ro_RO-mihai-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Un foc fără lemne se stinge, o lume fără poveste moare.",
    sid=0,
    speed=1.0,
)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
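The Python example assumes the model directory already contains the three paths used in the config. As a rough sketch, the check below reports which of them are missing before the config is built. The function name missing_model_files is hypothetical and not part of sherpa-onnx; the file names are the ones used in the example above.

```python
from pathlib import Path
from typing import List

# Paths referenced by the example config, relative to the model directory.
REQUIRED = ("ro_RO-mihai-medium.onnx", "tokens.txt", "espeak-ng-data")


def missing_model_files(model_dir: str) -> List[str]:
    """Return the required entries that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_model_files("vits-piper-ro_RO-mihai-medium")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```

Running such a check first tends to give a clearer message than a failed config validation.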
C++ API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
config.model.vits.tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Un foc fără lemne se stinge, o lume fără poveste moare.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx".into()),
tokens: Some("vits-piper-ro_RO-mihai-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ro_RO-mihai-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Un foc fără lemne se stinge, o lume fără poveste moare.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ro_RO-mihai-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx',
tokens: 'vits-piper-ro_RO-mihai-medium/tokens.txt',
dataDir: 'vits-piper-ro_RO-mihai-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Un foc fără lemne se stinge, o lume fără poveste moare.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
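The Node.js example reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal Python sketch of the same arithmetic follows. The function name and the numbers in the usage lines are illustrative assumptions, not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration in seconds of the generated mono audio.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative input: 44100 samples at 22050 Hz (2 s of audio) produced in 0.5 s.
    rtf = real_time_factor(0.5, 44100, 22050)
    print(f"RTF = {rtf:.3f}")
```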
Dart API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx',
tokens: 'vits-piper-ro_RO-mihai-medium/tokens.txt',
dataDir: 'vits-piper-ro_RO-mihai-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Un foc fără lemne se stinge, o lume fără poveste moare.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
lexicon: "",
tokens: "vits-piper-ro_RO-mihai-medium/tokens.txt",
dataDir: "vits-piper-ro_RO-mihai-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Un foc fără lemne se stinge, o lume fără poveste moare."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Un foc fără lemne se stinge, o lume fără poveste moare.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
tokens = "vits-piper-ro_RO-mihai-medium/tokens.txt",
dataDir = "vits-piper-ro_RO-mihai-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Un foc fără lemne se stinge, o lume fără poveste moare.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
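The callback in the Kotlin example returns 1 to continue and 0 to stop. The Python sketch below illustrates that contract with a callback that requests a stop once a sample budget is reached. StopAfter and drive are hypothetical names; drive is only a stand-in for the engine's generation loop so that the snippet runs without sherpa-onnx, and delivery of audio in several chunks is an assumption here.

```python
from typing import Callable, Iterable, Sequence


class StopAfter:
    """Callback that returns 1 (continue) until max_samples have been seen, then 0 (stop)."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.total = 0

    def __call__(self, samples: Sequence[float]) -> int:
        self.total += len(samples)
        return 1 if self.total < self.max_samples else 0


def drive(chunks: Iterable[Sequence[float]], callback: Callable[[Sequence[float]], int]) -> int:
    """Stand-in for a generation loop: feed chunks until the callback returns 0."""
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


if __name__ == "__main__":
    cb = StopAfter(max_samples=5)
    chunks = [[0.0] * 3, [0.0] * 3, [0.0] * 3]
    print("chunks delivered:", drive(chunks, cb))
```

Returning 0 is useful when playback is cancelled and any remaining audio is no longer needed.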
Java API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx");
vits.setTokens("vits-piper-ro_RO-mihai-medium/tokens.txt");
vits.setDataDir("vits-piper-ro_RO-mihai-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Un foc fără lemne se stinge, o lume fără poveste moare.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ro_RO-mihai-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ro_RO-mihai-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Un foc fără lemne se stinge, o lume fără poveste moare.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ro_RO-mihai-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ro_RO-mihai-medium/ro_RO-mihai-medium.onnx",
Tokens: "vits-piper-ro_RO-mihai-medium/tokens.txt",
DataDir: "vits-piper-ro_RO-mihai-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Un foc fără lemne se stinge, o lume fără poveste moare."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Un foc fără lemne se stinge, o lume fără poveste moare.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-ro
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Romanian (ro).
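The code samples on this page select the language through a `lang` entry in the generation config's extra field, which several APIs take as a JSON string such as `{"lang": "ro"}`. The following is a minimal sketch of building that string and rejecting codes that are not in the list above. `SUPPORTED_LANGS` and `make_extra_json` are hypothetical helper names for illustration and are not part of sherpa-onnx.

```python
import json

# Language codes copied from the list above (31 entries).
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it "
    "lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra_json(lang: str) -> str:
    """Return the JSON string for the extra field (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra_json("ro"))  # prints {"lang": "ro"}
```

Checking the code up front is a design choice of this sketch. How the library itself reacts to an unlisted code is not described on this page.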
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
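Since the table lists speaker IDs 0 through 9, a small guard can catch an out-of-range `sid` before it is passed to the generation config. This is only a sketch: `check_sid` is a hypothetical helper, and the assumption that valid IDs run from 0 to the number of speakers minus 1 is taken from the tables above.

```python
NUM_SPEAKERS = 10  # from the "Number of speakers" table above


def check_sid(sid: int, num_speakers: int = NUM_SPEAKERS) -> int:
    """Return sid unchanged if it is a valid speaker ID, else raise (hypothetical helper)."""
    if not 0 <= sid < num_speakers:
        raise ValueError(f"sid must be in [0, {num_speakers - 1}], got {sid}")
    return sid


print(check_sid(3))  # prints 3
```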
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ro"
audio = tts.generate("Acesta este un motor text to speech care folosește generația următoare de Kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
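If `soundfile` is not available, the generated samples can also be saved with Python's standard `wave` module. The sketch below assumes the samples are mono floats in the range [-1, 1] and converts them to 16-bit PCM. `write_wave` is a hypothetical helper name, and the short list passed to it is stand-in data rather than real model output.

```python
import struct
import wave


def write_wave(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples as 16-bit PCM using only the standard library."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the assumed [-1, 1] range
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)  # e.g. 24000 for this model
        f.writeframes(bytes(pcm))


# Stand-in data; in practice pass audio.samples and audio.sample_rate.
write_wave("demo.wav", [0.0, 0.5, -0.5, 1.0], 24000)
```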
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"ro\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
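The comments in `ProgressCallback` above state that returning 1 continues generation and returning 0 stops it. The following toy driver illustrates that contract in Python. It is a simulation only, not sherpa-onnx code: `run_with_callback` and `stop_after_half` are hypothetical names, and whether the real library keeps the chunk on which the callback returned 0 is an assumption made here for illustration.

```python
def run_with_callback(chunks, callback):
    """Toy driver: hand each chunk to callback and stop once it returns 0."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.extend(chunk)
        # progress is the fraction of chunks processed so far
        if callback(chunk, i / total) == 0:
            break
    return delivered


def stop_after_half(samples, progress):
    # 1 means continue, 0 means stop
    return 1 if progress < 0.5 else 0


chunks = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
out = run_with_callback(chunks, stop_after_half)
print(len(out))  # prints 4
```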
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
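When the compiled program is launched from a script rather than an interactive shell, the same library path can be set on the child process's environment. The sketch below picks the variable name by platform, as described above. `env_with_library_path` is a hypothetical helper, and the commented-out `subprocess.run` call assumes the binary was built at /tmp/test-supertonic.

```python
import os
import sys


def env_with_library_path(lib_dir: str, platform: str = sys.platform) -> dict:
    """Return a copy of os.environ with lib_dir prepended to the loader search path."""
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    env = dict(os.environ)
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
# Example use (requires the compiled binary to exist):
# import subprocess
# subprocess.run(["/tmp/test-supertonic"], env=env, check=True)
```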
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "ro"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "ro"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Acesta este un motor text to speech care folosește generația următoare de Kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'ro'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
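The Node.js example above reports a real-time factor (RTF), computed as elapsed time divided by the duration of the generated audio. The same arithmetic is sketched below in Python. `real_time_factor` is a hypothetical helper, and the numbers are made up for illustration, not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """RTF = processing time / audio duration; values below 1 mean faster than real time."""
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 48000 samples at 24000 Hz is 2 s of audio, generated in 0.5 s.
rtf = real_time_factor(0.5, 48000, 24000)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```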
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'ro'},
);
final audio = tts.generateWithConfig(text: 'Acesta este un motor text to speech care folosește generația următoare de Kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "ro"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ro\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "ro"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Acesta este un motor text to speech care folosește generația următoare de Kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"ro\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "ro"}';
Audio := Tts.GenerateWithConfig('Acesta este un motor text to speech care folosește generația următoare de Kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Acesta este un motor text to speech care folosește generația următoare de Kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "ro"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 1
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 2
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 3
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 4
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 5
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 6
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 7
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 8
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Speaker 9
0
Salut lume.
1
Cum te simți astăzi?
2
Cerul este albastru, iar vântul este blând.
3
Învățarea automată ajută computerele să învețe din date.
4
Sinteza vocală transformă textul în sunet clar.
5
Elevii au citit o poveste scurtă la bibliotecă.
6
Trenul a întârziat din cauza lucrărilor la șine.
7
Modelele mici rulează rapid pe dispozitive locale.
8
Asistentul vocal ajută la sarcinile zilnice.
9
Citirea stabilă este importantă pentru propoziții scurte și lungi.
Russian
This section lists text to speech models for Russian.
vits-piper-ru_RU-denis-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/denis/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ru_RU-denis-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.vits.model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Stop early instead of passing a NULL handle to the calls below,
    // e.g., when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }

  const char *text = "Если курица укусит, ей отрубят голову.";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ru_RU-denis-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
            data_dir="vits-piper-ru_RU-denis-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-denis-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
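The example above relies on the third-party soundfile package to save the result. If you would rather avoid that dependency, the sketch below writes a 16-bit PCM WAV file using only the Python standard library. The function name `write_wave_int16` is hypothetical, and the sketch assumes the generated samples are mono floats in the range [-1, 1]; values outside that range are clipped.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples as 16-bit PCM WAV and return the frame count.

    Hypothetical helper; assumes samples are floats in [-1, 1].
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
    return len(pcm) // 2
```

With the objects from the example above, a call would look like `write_wave_int16("test.wav", audio.samples, audio.sample_rate)`.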
C++ API
You can use the following code to play with vits-piper-ru_RU-denis-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Если курица укусит, ей отрубят голову.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx".into()),
tokens: Some("vits-piper-ru_RU-denis-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ru_RU-denis-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Если курица укусит, ей отрубят голову.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ru_RU-denis-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx',
tokens: 'vits-piper-ru_RU-denis-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-denis-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Если курица укусит, ей отрубят голову.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
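The Node.js example reports a real-time factor (RTF), computed as the elapsed processing time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched below in Python; the function name `real_time_factor` is hypothetical and the numbers are illustrative, not a measurement of this model.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz is 2 seconds of audio.
print(f"{real_time_factor(0.5, 44100, 22050):.3f}")  # prints 0.250
```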
Dart API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx',
tokens: 'vits-piper-ru_RU-denis-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-denis-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
lexicon: "",
tokens: "vits-piper-ru_RU-denis-medium/tokens.txt",
dataDir: "vits-piper-ru_RU-denis-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Если курица укусит, ей отрубят голову."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ru_RU-denis-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-denis-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-denis-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
tokens = "vits-piper-ru_RU-denis-medium/tokens.txt",
dataDir = "vits-piper-ru_RU-denis-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Если курица укусит, ей отрубят голову.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
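Several examples in this section pass a callback that returns 1 to continue generating and 0 to stop. The sketch below shows, in plain Python and independent of any binding, how such a callback could request a stop once a given amount of audio has been produced. The class name `StopAfterSeconds` and its argument list are assumptions for illustration; the chunks are simulated rather than produced by a TTS engine.

```python
class StopAfterSeconds:
    """Callback that requests a stop once enough audio has been produced.

    Hypothetical helper. It follows the convention stated in the examples:
    return 1 to continue generating, 0 to stop.
    """

    def __init__(self, max_seconds, sample_rate):
        self.max_samples = int(max_seconds * sample_rate)
        self.num_samples = 0

    def __call__(self, samples, progress=0.0):
        # Count the samples delivered so far across all chunks.
        self.num_samples += len(samples)
        return 1 if self.num_samples < self.max_samples else 0


# Simulate chunks of 11025 samples (0.5 seconds at 22050 Hz).
cb = StopAfterSeconds(max_seconds=1.0, sample_rate=22050)
results = [cb([0.0] * 11025) for _ in range(3)]
print(results)  # prints [1, 0, 0]
```

Keeping the counter on the object rather than in a global makes it straightforward to read back how much audio was delivered before the stop was requested.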
Java API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx");
vits.setTokens("vits-piper-ru_RU-denis-medium/tokens.txt");
vits.setDataDir("vits-piper-ru_RU-denis-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ru_RU-denis-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ru_RU-denis-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ru_RU-denis-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ru_RU-denis-medium/ru_RU-denis-medium.onnx",
Tokens: "vits-piper-ru_RU-denis-medium/tokens.txt",
DataDir: "vits-piper-ru_RU-denis-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Если курица укусит, ей отрубят голову."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Если курица укусит, ей отрубят голову.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-ru_RU-dmitri-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/dmitri/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
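The Russian Piper models in this section follow a common naming pattern: the directory is `vits-piper-<locale>-<voice>-<quality>`, the ONNX file inside it drops the `vits-piper-` prefix, and the upstream URL ends in `<language>/<locale>/<voice>/<quality>`. The sketch below derives those strings from a model name. The function name `parse_piper_model_name` is hypothetical, and the pattern is inferred only from the models listed here, so it may not hold for every model (for example, a voice name that itself contains a hyphen).

```python
def parse_piper_model_name(name):
    """Split e.g. 'vits-piper-ru_RU-dmitri-medium' into its parts.

    Hypothetical helper; the naming pattern is an assumption inferred
    from the models in this section.
    """
    prefix = "vits-piper-"
    if not name.startswith(prefix):
        raise ValueError(f"unexpected model name: {name}")
    # Raises ValueError if the remainder does not have exactly three parts.
    locale, voice, quality = name[len(prefix):].split("-")
    language = locale.split("_")[0]
    return {
        "locale": locale,
        "voice": voice,
        "quality": quality,
        "onnx": f"{name}/{locale}-{voice}-{quality}.onnx",
        "tokens": f"{name}/tokens.txt",
        "data_dir": f"{name}/espeak-ng-data",
        "source": "https://huggingface.co/rhasspy/piper-voices/tree/main/"
                  f"{language}/{locale}/{voice}/{quality}",
    }


info = parse_piper_model_name("vits-piper-ru_RU-dmitri-medium")
print(info["onnx"])
print(info["source"])
```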
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.vits.model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
  config.model.vits.tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
  config.model.vits.data_dir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;

  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  if (!tts) {
    // Stop early instead of passing a NULL handle to the calls below,
    // e.g., when a model path is wrong
    fprintf(stderr, "Failed to create the TTS engine. Please check your config\n");
    return -1;
  }

  const char *text = "Если курица укусит, ей отрубят голову.";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;

  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");

  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ru_RU-dmitri-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
            data_dir="vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
            tokens="vits-piper-ru_RU-dmitri-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
                     sid=0,
                     speed=1.0)

sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Если курица укусит, ей отрубят голову.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run
/tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx".into()),
tokens: Some("vits-piper-ru_RU-dmitri-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ru_RU-dmitri-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Если курица укусит, ей отрубят голову.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx',
tokens: 'vits-piper-ru_RU-dmitri-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Если курица укусит, ей отрубят голову.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx',
tokens: 'vits-piper-ru_RU-dmitri-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
lexicon: "",
tokens: "vits-piper-ru_RU-dmitri-medium/tokens.txt",
dataDir: "vits-piper-ru_RU-dmitri-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Если курица укусит, ей отрубят голову."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
tokens = "vits-piper-ru_RU-dmitri-medium/tokens.txt",
dataDir = "vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Если курица укусит, ей отрубят голову.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx");
vits.setTokens("vits-piper-ru_RU-dmitri-medium/tokens.txt");
vits.setDataDir("vits-piper-ru_RU-dmitri-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ru_RU-dmitri-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ru_RU-dmitri-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ru_RU-dmitri-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ru_RU-dmitri-medium/ru_RU-dmitri-medium.onnx",
Tokens: "vits-piper-ru_RU-dmitri-medium/tokens.txt",
DataDir: "vits-piper-ru_RU-dmitri-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Если курица укусит, ей отрубят голову."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Если курица укусит, ей отрубят голову.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-ru_RU-irina-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/irina/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
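As a quick sanity check on the sample rate listed above, the header of a generated file can be inspected with Python's standard `wave` module. This is only a minimal sketch: the helper name `wav_info` is hypothetical, and it assumes the file is an uncompressed PCM WAV, which is the only kind the `wave` module reads.

```python
import wave


def wav_info(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        # Number of frames divided by frames per second gives the duration.
        duration = f.getnframes() / sample_rate
    return channels, sample_rate, duration
```

For a file generated with this model, `wav_info("test.wav")` would be expected to report a sample rate of 22050.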
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ru_RU-irina-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Если курица укусит, ей отрубят голову.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ru_RU-irina-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
data_dir="vits-piper-ru_RU-irina-medium/espeak-ng-data",
tokens="vits-piper-ru_RU-irina-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
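Before building the config, it can help to confirm that the three paths used above actually exist. The following is a minimal sketch using only the standard library; the helper name `missing_model_paths` is hypothetical and not part of sherpa-onnx, and the file layout is assumed to match the paths in the example.

```python
from pathlib import Path


def missing_model_paths(model_dir, onnx_name):
    """Return the required model paths that do not exist under model_dir."""
    root = Path(model_dir)
    required = [
        root / onnx_name,         # the acoustic model, e.g. ru_RU-irina-medium.onnx
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in required if not p.exists()]
```

An empty list from `missing_model_paths("vits-piper-ru_RU-irina-medium", "ru_RU-irina-medium.onnx")` means all three paths were found relative to the current directory.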
C++ API
You can use the following code to play with vits-piper-ru_RU-irina-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Если курица укусит, ей отрубят голову.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx".into()),
tokens: Some("vits-piper-ru_RU-irina-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ru_RU-irina-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Если курица укусит, ей отрубят голову.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ru_RU-irina-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx',
tokens: 'vits-piper-ru_RU-irina-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-irina-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Если курица укусит, ей отрубят голову.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
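The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than playback. The same arithmetic can be sketched in Python as follows; the function name `real_time_factor` is hypothetical, and the inputs are assumed to be the elapsed wall-clock time, the number of generated samples, and the sample rate.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the audio in seconds.
    duration = num_samples / sample_rate
    return elapsed_seconds / duration
```

For example, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.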
Dart API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx',
tokens: 'vits-piper-ru_RU-irina-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-irina-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
lexicon: "",
tokens: "vits-piper-ru_RU-irina-medium/tokens.txt",
dataDir: "vits-piper-ru_RU-irina-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Если курица укусит, ей отрубят голову."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ru_RU-irina-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-irina-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-irina-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
tokens = "vits-piper-ru_RU-irina-medium/tokens.txt",
dataDir = "vits-piper-ru_RU-irina-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Если курица укусит, ей отрубят голову.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
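The callback in the example above always returns 1, so generation never stops early. As an illustration of the return-value convention (1 to continue, 0 to stop), here is a minimal Python sketch of a callback that asks generation to stop once a sample budget is reached. The factory name `make_sample_budget_callback` is hypothetical, and how such a callback is passed to a particular language binding is not shown here; the optional `progress` argument is included only because some of the bindings in this document pass one.

```python
def make_sample_budget_callback(max_samples):
    """Build a callback that returns 0 once max_samples samples have been seen."""
    seen = 0

    def callback(samples, progress=0.0):
        nonlocal seen
        # Count the samples delivered in this chunk.
        seen += len(samples)
        # 1 means continue generating, 0 means stop.
        return 1 if seen < max_samples else 0

    return callback
```

Keeping the counter inside a closure means each callback has its own independent budget.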
Java API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx");
vits.setTokens("vits-piper-ru_RU-irina-medium/tokens.txt");
vits.setDataDir("vits-piper-ru_RU-irina-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ru_RU-irina-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ru_RU-irina-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ru_RU-irina-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ru_RU-irina-medium/ru_RU-irina-medium.onnx",
Tokens: "vits-piper-ru_RU-irina-medium/tokens.txt",
DataDir: "vits-piper-ru_RU-irina-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Если курица укусит, ей отрубят голову."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Если курица укусит, ей отрубят голову.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-ru_RU-ruslan-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ru/ru_RU/ruslan/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Если курица укусит, ей отрубят голову.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
data_dir="vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
tokens="vits-piper-ru_RU-ruslan-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Если курица укусит, ей отрубят голову.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
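The example above relies on the third-party soundfile package to save the result. If you would rather avoid that dependency, the samples can be written as 16-bit PCM with Python's standard `wave` module. This is a sketch under stated assumptions: the helper name `write_wav_int16` is hypothetical, the audio is assumed to be mono, and the samples are assumed to be floats in the range [-1, 1] (values outside that range are clipped).

```python
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        # Clip to [-1, 1] so that the scaled value fits in a signed 16-bit integer.
        s = max(-1.0, min(1.0, float(s)))
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))
```

With the objects from the example above, the call would look like `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.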
C++ API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
config.model.vits.tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Если курица укусит, ей отрубят голову.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx".into()),
tokens: Some("vits-piper-ru_RU-ruslan-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ru_RU-ruslan-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Если курица укусит, ей отрубят голову.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx',
tokens: 'vits-piper-ru_RU-ruslan-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Если курица укусит, ей отрубят голову.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx',
tokens: 'vits-piper-ru_RU-ruslan-medium/tokens.txt',
dataDir: 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Если курица укусит, ей отрубят голову.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
lexicon: "",
tokens: "vits-piper-ru_RU-ruslan-medium/tokens.txt",
dataDir: "vits-piper-ru_RU-ruslan-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Если курица укусит, ей отрубят голову."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
tokens = "vits-piper-ru_RU-ruslan-medium/tokens.txt",
dataDir = "vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Если курица укусит, ей отрубят голову.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx");
vits.setTokens("vits-piper-ru_RU-ruslan-medium/tokens.txt");
vits.setDataDir("vits-piper-ru_RU-ruslan-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Если курица укусит, ей отрубят голову.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ru_RU-ruslan-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ru_RU-ruslan-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Если курица укусит, ей отрубят голову.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ru_RU-ruslan-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ru_RU-ruslan-medium/ru_RU-ruslan-medium.onnx",
Tokens: "vits-piper-ru_RU-ruslan-medium/tokens.txt",
DataDir: "vits-piper-ru_RU-ruslan-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Если курица укусит, ей отрубят голову."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Если курица укусит, ей отрубят голову.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-ru
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Russian (ru).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
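The language code and the speaker ID are the two values most likely to be mistyped when calling this model. The sketch below checks both against the values listed on this page and builds the JSON string that the C, C#, Java, Pascal, and Go examples pass as `extra`. `make_extra` is a hypothetical helper, not part of sherpa-onnx.

```python
import json

# Language codes listed on this page for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()

NUM_SPEAKERS = 10  # speaker IDs 0..9, per the table above


def make_extra(lang, sid=0):
    """Hypothetical helper: validate inputs and return the `extra` JSON string."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    print(make_extra("ru"))  # prints {"lang": "ru"}
```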
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "ru"
audio = tts.generate("Это движок преобразования текста в речь, использующий Kaldi следующего поколения.", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
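To confirm that a generated file has the expected format, you can read its header back with Python's standard `wave` module, assuming the file is 16-bit PCM. The sketch below is self-contained: it writes a short sine tone as a stand-in for TTS output (an assumption made so the snippet runs without the model) and then reports the sample rate, channel count, and duration. `wav_info` and `write_tone` are hypothetical helper names.

```python
import math
import struct
import wave


def wav_info(path):
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


def write_tone(path, sample_rate=24000, seconds=1.0, freq=440.0):
    """Stand-in for TTS output: a mono 16-bit sine tone."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / sample_rate)),
        )
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(frames)


if __name__ == "__main__":
    write_tone("tone.wav")
    print(wav_info("tone.wav"))
```

For this model, calling `wav_info("test.wav")` on the file written by the example above should report a sample rate of 24000, matching the table in "Info about this model".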
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"ru\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
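If you launch the compiled binary from a script rather than a shell, the same library search path can be set per process. The following is a minimal Python sketch, assuming Linux or macOS only; `with_library_path` is a hypothetical helper that picks the variable name by platform and prepends the library directory without discarding any existing value.

```python
import os
import sys


def with_library_path(lib_dir, platform=sys.platform, env=None):
    """Return a copy of `env` with `lib_dir` prepended to the loader search path.

    Uses DYLD_LIBRARY_PATH on macOS ("darwin") and LD_LIBRARY_PATH otherwise.
    """
    env = dict(os.environ if env is None else env)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    env = with_library_path("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(name, "=", env[name])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, for example `subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=env)`.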
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "ru"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files in /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "ru"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Это движок преобразования текста в речь, использующий Kaldi следующего поколения.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'ru'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
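The Node.js example above reports a real-time factor (RTF): elapsed wall-clock time divided by the duration of the generated audio, so a value below 1 means synthesis runs faster than playback. The same arithmetic as a minimal Python sketch; the function name and the sample numbers are illustrative only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers: 4 s of 24 kHz audio generated in 0.5 s.
    rtf = real_time_factor(0.5, 96000, 24000)
    print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.125"
```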
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'ru'},
);
final audio = tts.generateWithConfig(text: 'Это движок преобразования текста в речь, использующий Kaldi следующего поколения.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "ru"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"ru\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "ru"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Это движок преобразования текста в речь, использующий Kaldi следующего поколения.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"ru\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "ru"}';
Audio := Tts.GenerateWithConfig('Это движок преобразования текста в речь, использующий Kaldi следующего поколения.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Это движок преобразования текста в речь, использующий Kaldi следующего поколения."
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "ru"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below. Within each speaker, the number is the sample index and the line after it is the input text:
Speaker 0
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 1
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 2
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 3
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 4
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 5
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 6
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 7
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 8
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Speaker 9
0
Привет мир.
1
Как у тебя дела сегодня?
2
Небо голубое, а ветер мягкий.
3
Машинное обучение помогает компьютерам учиться на данных.
4
Синтез речи превращает текст в понятный звук.
5
Ученики прочитали короткий рассказ в библиотеке.
6
Поезд задержался из-за ремонта путей.
7
Небольшие модели быстро работают на локальных устройствах.
8
Голосовой помощник помогает в повседневных задачах.
9
Стабильное чтение важно для коротких и длинных предложений.
Serbian
This section lists text to speech models for Serbian.
vits-piper-sr_RS-serbski_institut-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sr/sr_RS/serbski_institut/medium
| Number of speakers | Sample rate |
|---|---|
| 2 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
config.model.vits.tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
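If you launch the compiled example from a script instead of a shell, the same library search path can be set programmatically. The following is a minimal sketch, assuming the install prefix /tmp/sherpa-onnx/shared used above. The helpers `loader_path_var` and `env_with_lib_dir` are hypothetical names introduced only for illustration.

```python
import os
import sys

LIB_DIR = "/tmp/sherpa-onnx/shared/lib"  # install prefix used on this page


def loader_path_var(platform: str = sys.platform) -> str:
    """Name of the dynamic-loader search-path variable for the platform."""
    return "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"


def env_with_lib_dir(lib_dir: str = LIB_DIR, platform: str = sys.platform,
                     base=None) -> dict:
    """Copy the environment and prepend lib_dir to the loader search path."""
    env = dict(os.environ if base is None else base)
    var = loader_path_var(platform)
    old = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + old if old else "")
    return env
```

You could then start the binary with `subprocess.run(["/tmp/test-piper"], env=env_with_lib_dir(), cwd="/tmp")`.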
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
data_dir="vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
tokens="vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Круг не може постојати без свог центра, а нација не може постојати без својих хероја.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
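If you prefer not to depend on soundfile, the generated samples can also be saved with Python's standard `wave` module. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1]; the helper `write_wav_int16` is a hypothetical name and is not part of sherpa-onnx.

```python
import math
import struct
import wave


def write_wav_int16(path: str, samples, sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM WAV; return frames written."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the assumed [-1, 1] range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2


# Stand-in data: a 0.1 second 440 Hz tone at 22050 Hz.
tone = [0.5 * math.sin(2 * math.pi * 440 * i / 22050) for i in range(2205)]
```

With a real result you would call `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.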
C++ API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
config.model.vits.tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx".into()),
tokens: Some("vits-piper-sr_RS-serbski_institut-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx',
tokens: 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt',
dataDir: 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Круг не може постојати без свог центра, а нација не може постојати без својих хероја.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
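The Node.js example above reports a real-time factor (RTF), which is the time spent generating divided by the duration of the generated audio. A value below 1 means synthesis ran faster than playback. The following sketch shows the same arithmetic; the function name `real_time_factor` is chosen here only for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """RTF = processing time / audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.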
Dart API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx',
tokens: 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt',
dataDir: 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Круг не може постојати без свог центра, а нација не може постојати без својих хероја.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
lexicon: "",
tokens: "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
dataDir: "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
tokens = "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
dataDir = "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
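The callback in the example above returns 1 to continue and 0 to stop. The following sketch illustrates that contract with a stand-in generator. It does not call sherpa-onnx: `fake_generate` and `make_limit_callback` are hypothetical helpers, and whether the real engine keeps the chunk on which the callback returned 0 is not something this sketch claims.

```python
from typing import Callable, List, Sequence


def fake_generate(chunks: Sequence[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Stand-in engine: pass each chunk to callback, stop when it returns 0."""
    out: List[float] = []
    for chunk in chunks:
        out.extend(chunk)
        if callback(chunk) == 0:
            break
    return out


def make_limit_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Build a callback that asks to stop once max_samples have been seen."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += len(samples)
        return 1 if seen < max_samples else 0  # 1: continue, 0: stop

    return callback
```

A callback like this is one way to cut generation short, for example when the user cancels playback.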
Java API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx");
vits.setTokens("vits-piper-sr_RS-serbski_institut-medium/tokens.txt");
vits.setDataDir("vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Круг не може постојати без свог центра, а нација не може постојати без својих хероја.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sr_RS-serbski_institut-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Круг не може постојати без свог центра, а нација не може постојати без својих хероја.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sr_RS-serbski_institut-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sr_RS-serbski_institut-medium/sr_RS-serbski_institut-medium.onnx",
Tokens: "vits-piper-sr_RS-serbski_institut-medium/tokens.txt",
DataDir: "vits-piper-sr_RS-serbski_institut-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Круг не може постојати без свог центра, а нација не може постојати без својих хероја."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Круг не може постојати без свог центра, а нација не може постојати без својих хероја.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Slovak
This section lists text to speech models for Slovak.
vits-piper-sk_SK-lili-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sk/sk_SK/lili/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sk_SK-lili-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
config.model.vits.tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Kto nepozná strach, nepozná vôľu.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
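After the example has run, you may want to confirm that ./test.wav looks as expected. The following is a minimal sketch using Python's standard `wave` module, assuming the file is PCM WAV (the only kind that module reads). The helper `describe_wav` is a hypothetical name.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return sample rate, channel count and duration of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        frames = f.getnframes()
        return {
            "sample_rate": rate,
            "channels": f.getnchannels(),
            "seconds": frames / rate,
        }
```

For this model, `describe_wav("/tmp/test.wav")["sample_rate"]` should match the 22050 listed in the table above.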
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sk_SK-lili-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
data_dir="vits-piper-sk_SK-lili-medium/espeak-ng-data",
tokens="vits-piper-sk_SK-lili-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Kto nepozná strach, nepozná vôľu.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-sk_SK-lili-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
config.model.vits.tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Kto nepozná strach, nepozná vôľu.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx".into()),
tokens: Some("vits-piper-sk_SK-lili-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sk_SK-lili-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Kto nepozná strach, nepozná vôľu.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sk_SK-lili-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx',
tokens: 'vits-piper-sk_SK-lili-medium/tokens.txt',
dataDir: 'vits-piper-sk_SK-lili-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Kto nepozná strach, nepozná vôľu.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx',
tokens: 'vits-piper-sk_SK-lili-medium/tokens.txt',
dataDir: 'vits-piper-sk_SK-lili-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Kto nepozná strach, nepozná vôľu.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
lexicon: "",
tokens: "vits-piper-sk_SK-lili-medium/tokens.txt",
dataDir: "vits-piper-sk_SK-lili-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Kto nepozná strach, nepozná vôľu."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sk_SK-lili-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sk_SK-lili-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sk_SK-lili-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Kto nepozná strach, nepozná vôľu.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
tokens = "vits-piper-sk_SK-lili-medium/tokens.txt",
dataDir = "vits-piper-sk_SK-lili-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Kto nepozná strach, nepozná vôľu.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx");
vits.setTokens("vits-piper-sk_SK-lili-medium/tokens.txt");
vits.setDataDir("vits-piper-sk_SK-lili-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Kto nepozná strach, nepozná vôľu.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sk_SK-lili-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sk_SK-lili-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Kto nepozná strach, nepozná vôľu.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sk_SK-lili-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sk_SK-lili-medium/sk_SK-lili-medium.onnx",
Tokens: "vits-piper-sk_SK-lili-medium/tokens.txt",
DataDir: "vits-piper-sk_SK-lili-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Kto nepozná strach, nepozná vôľu."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Kto nepozná strach, nepozná vôľu.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-sk
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Slovak (sk).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
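The language list and speaker table above can be turned into a small input check before calling any of the APIs below. This is a minimal sketch, not part of sherpa-onnx: the helper names `build_extra` and `check_sid` are hypothetical, and the JSON shape `{"lang": "sk"}` is taken from the string-based examples further down this page (C, C#, Java, Pascal, Go).

```python
import json

# Language codes copied from the list above; helper names are illustrative only.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10  # from the table above: speaker IDs 0..9


def build_extra(lang: str) -> str:
    """Return the JSON string for the `extra` field, e.g. '{"lang": "sk"}'."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


def check_sid(sid: int) -> int:
    """Return sid unchanged if it is a valid speaker ID, otherwise raise."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid


print(build_extra("sk"))
```

Failing early on an unknown language code or speaker ID is a design choice for this sketch; how the library itself reacts to such values is not described on this page.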
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sk"
audio = tts.generate("Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
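The Node.js example later on this page reports a real-time factor (RTF); the same figure can be computed in Python. This is a minimal sketch under the assumption that RTF means elapsed time divided by audio duration, as in that example. The function name `real_time_factor` is hypothetical, and the numbers in the demo call are made up.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # length of the generated audio in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 2 seconds of 24 kHz audio generated in 0.5 seconds.
print(f"RTF = {real_time_factor(0.5, 48000, 24000):.3f}")
```

With the example above, one way to use it would be to wrap the `tts.generate(...)` call in `time.perf_counter()` readings and pass `len(audio.samples)` and `audio.sample_rate`.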
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"sk\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
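If you launch the compiled binary from a script, the loader search path can be set for that one process instead of exporting it in the shell. This is a minimal sketch: the helper name `library_path_env` is hypothetical, and it assumes that `LD_LIBRARY_PATH` (Linux) and `DYLD_LIBRARY_PATH` (macOS) are the only variables involved, as in the commands above.

```python
import os
import sys


def library_path_env(lib_dir: str, platform: str = sys.platform, env=None) -> dict:
    """Return a copy of env with lib_dir prepended to the loader search path."""
    env = dict(os.environ if env is None else env)
    # macOS uses DYLD_LIBRARY_PATH; this sketch treats everything else like Linux.
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = f"{lib_dir}:{old}" if old else lib_dir
    return env


env = library_path_env("/tmp/sherpa-onnx/shared/lib")
# To run the binary with this environment (requires the build steps above):
# import subprocess
# subprocess.run(["/tmp/test-supertonic"], cwd="/tmp", env=env, check=True)
```

Returning a copy leaves the parent process environment untouched, so other commands in the same script are not affected.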
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "sk"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "sk"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'sk'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'sk'},
);
final audio = tts.generateWithConfig(text: 'Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "sk"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sk\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "sk"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"sk\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "sk"}';
Audio := Tts.GenerateWithConfig('Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "sk"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 1
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 2
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 3
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 4
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 5
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 6
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 7
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 8
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
Speaker 9
0
Ahoj svet.
1
Ako sa dnes máš?
2
Obloha je modrá a vietor je mierny.
3
Strojové učenie pomáha počítačom učiť sa z dát.
4
Syntéza reči premieňa text na zrozumiteľný zvuk.
5
Žiaci čítali krátky príbeh v knižnici.
6
Vlak meškal pre údržbu trate.
7
Malé modely bežia rýchlo na lokálnych zariadeniach.
8
Hlasový asistent pomáha s každodennými úlohami.
9
Stabilné čítanie je dôležité pre krátke aj dlhé vety.
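The grid above pairs each of the ten speaker IDs with the same ten sentences. A minimal sketch of how such a batch could be enumerated before handing each item to one of the APIs shown earlier follows; the function name `sample_jobs` and the output file name pattern are hypothetical and do not come from sherpa-onnx.

```python
# The ten Slovak sentences listed under each speaker above.
SENTENCES = [
    "Ahoj svet.",
    "Ako sa dnes máš?",
    "Obloha je modrá a vietor je mierny.",
    "Strojové učenie pomáha počítačom učiť sa z dát.",
    "Syntéza reči premieňa text na zrozumiteľný zvuk.",
    "Žiaci čítali krátky príbeh v knižnici.",
    "Vlak meškal pre údržbu trate.",
    "Malé modely bežia rýchlo na lokálnych zariadeniach.",
    "Hlasový asistent pomáha s každodennými úlohami.",
    "Stabilné čítanie je dôležité pre krátke aj dlhé vety.",
]
NUM_SPEAKERS = 10  # speaker IDs 0..9


def sample_jobs(sentences=SENTENCES, num_speakers=NUM_SPEAKERS):
    """Return (sid, index, text, filename) for every speaker/sentence pair."""
    return [
        (sid, idx, text, f"sk-sid{sid}-{idx}.wav")  # hypothetical file name pattern
        for sid in range(num_speakers)
        for idx, text in enumerate(sentences)
    ]


jobs = sample_jobs()
print(len(jobs), jobs[0])
```

Each tuple carries everything a generation call needs: the speaker ID for `sid`, the text to synthesize, and a destination file name.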
Slovenian
This section lists text to speech models for Slovenian.
vits-piper-sl_SI-artur-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sl/sl_SI/artur/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
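Once one of the examples below has written test.wav, its header can be checked against the sample rate in the table above. This is a minimal sketch using only the Python standard library; it assumes the file is a plain PCM WAV file, and the helper name `wav_info` is hypothetical. The demo writes its own short file so that it runs without a model.

```python
import os
import tempfile
import wave


def wav_info(path: str) -> tuple[int, int, float]:
    """Return (sample_rate, num_channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


# Self-contained demo: write 1 second of 16-bit mono silence at 22050 Hz,
# then read the header back. In practice, pass "./test.wav" instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    f.setframerate(22050)
    f.writeframes(b"\x00\x00" * 22050)

print(wav_info(demo_path))
```

A sample rate other than 22050 for this model would suggest that the wrong model files were loaded or that the file was written with a hard-coded rate.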
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sl_SI-artur-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
config.model.vits.tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Kto sa nebojí, nie je hlúpy.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
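If you launch the compiled binary from a script rather than a shell, the same loader variable can be set programmatically. The following is a minimal sketch, not part of sherpa-onnx: the helper name `loader_env` is hypothetical, and it assumes the library directory used in the commands above.

```python
import os
import sys


def loader_env(lib_dir, platform=sys.platform, environ=None):
    """Return (variable name, environment dict) with lib_dir prepended.

    Hypothetical helper: picks DYLD_LIBRARY_PATH on macOS and
    LD_LIBRARY_PATH elsewhere, mirroring the two export commands.
    """
    env = dict(os.environ if environ is None else environ)
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    # Prepend lib_dir and keep any existing value after a colon.
    env[var] = lib_dir + (":" + old if old else "")
    return var, env


var, env = loader_env("/tmp/sherpa-onnx/shared/lib")
print(var, "=", env[var])
# To run the example with this environment, one could then call:
#   subprocess.run(["/tmp/test-piper"], env=env, check=True)
```

Passing the environment explicitly to the child process avoids modifying the environment of the calling script.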
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sl_SI-artur-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
data_dir="vits-piper-sl_SI-artur-medium/espeak-ng-data",
tokens="vits-piper-sl_SI-artur-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Kto sa nebojí, nie je hlúpy.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
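Before constructing the config, it can help to confirm that the extracted model directory contains the three paths the example refers to. The sketch below uses only the standard library; the function name `missing_model_files` is hypothetical and not part of the sherpa-onnx API.

```python
from pathlib import Path


def missing_model_files(model_dir, model_name):
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / f"{model_name}.onnx",  # the VITS model file
        root / "tokens.txt",          # the token table
        root / "espeak-ng-data",      # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files("vits-piper-sl_SI-artur-medium", "sl_SI-artur-medium")
if missing:
    print("Missing paths:")
    for p in missing:
        print("  " + p)
else:
    print("All model files are present")
```

Running this check first gives a clearer message than a failed `config.validate()` when the model was extracted to a different directory.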
C++ API
You can use the following code to play with vits-piper-sl_SI-artur-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
config.model.vits.tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Kto sa nebojí, nie je hlúpy.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx".into()),
tokens: Some("vits-piper-sl_SI-artur-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sl_SI-artur-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Kto sa nebojí, nie je hlúpy.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sl_SI-artur-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx',
tokens: 'vits-piper-sl_SI-artur-medium/tokens.txt',
dataDir: 'vits-piper-sl_SI-artur-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Kto sa nebojí, nie je hlúpy.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
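The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. An RTF below 1 means synthesis ran faster than playback. The same arithmetic can be sketched in Python; the function name and the numbers below are illustrative only.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers: 44100 samples at 22050 Hz (2 seconds of audio)
# generated in 0.5 seconds.
rtf = real_time_factor(0.5, 44100, 22050)
print(f"RTF = {rtf:.3f}")
```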
Dart API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx',
tokens: 'vits-piper-sl_SI-artur-medium/tokens.txt',
dataDir: 'vits-piper-sl_SI-artur-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Kto sa nebojí, nie je hlúpy.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
lexicon: "",
tokens: "vits-piper-sl_SI-artur-medium/tokens.txt",
dataDir: "vits-piper-sl_SI-artur-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Kto sa nebojí, nie je hlúpy."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sl_SI-artur-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sl_SI-artur-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sl_SI-artur-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Kto sa nebojí, nie je hlúpy.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
tokens = "vits-piper-sl_SI-artur-medium/tokens.txt",
dataDir = "vits-piper-sl_SI-artur-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Kto sa nebojí, nie je hlúpy.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
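Several examples on this page pass a callback that returns 1 to continue generating and 0 to stop. The sketch below simulates that contract in plain Python so the control flow is easy to see. It does not call sherpa-onnx: `run_with_callback` is a hypothetical stand-in for the generation loop, and how the library actually chunks audio internally is not shown here.

```python
def run_with_callback(chunks, callback):
    """Deliver chunks one by one; stop early when the callback returns 0."""
    delivered = []
    total = len(chunks)
    for i, chunk in enumerate(chunks, start=1):
        delivered.append(chunk)
        progress = i / total
        if callback(chunk, progress) == 0:
            break  # the caller asked to stop generating
    return delivered


def stop_at_half(samples, progress):
    """Example callback: continue until half of the work is done."""
    print(f"Progress: {progress * 100:.1f}%")
    return 1 if progress < 0.5 else 0


chunks = [[0.0] * 4 for _ in range(4)]  # four dummy audio chunks
delivered = run_with_callback(chunks, stop_at_half)
print(len(delivered), "of", len(chunks), "chunks delivered")
```

A callback that always returns 1 lets generation run to completion, which is what the Kotlin and Java examples do.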
Java API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx");
vits.setTokens("vits-piper-sl_SI-artur-medium/tokens.txt");
vits.setDataDir("vits-piper-sl_SI-artur-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Kto sa nebojí, nie je hlúpy.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sl_SI-artur-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sl_SI-artur-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Kto sa nebojí, nie je hlúpy.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sl_SI-artur-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sl_SI-artur-medium/sl_SI-artur-medium.onnx",
Tokens: "vits-piper-sl_SI-artur-medium/tokens.txt",
DataDir: "vits-piper-sl_SI-artur-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Kto sa nebojí, nie je hlúpy."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Kto sa nebojí, nie je hlúpy.
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-sl
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Slovenian (sl).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
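The language list and speaker count above can be turned into a small input check. The following is a sketch under stated assumptions: the helper `make_generation_args` is hypothetical, the language codes are copied from the list above, and the JSON shape of `extra` follows the C and C# examples on this page.

```python
import json

# The 31 language codes listed for this model.
SUPPORTED_LANGS = (
    "en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, "
    "id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi"
).split(", ")
NUM_SPEAKERS = 10  # valid speaker IDs are 0..9


def make_generation_args(lang, sid):
    """Validate lang and sid, then build the JSON string used for `extra`."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in 0..{NUM_SPEAKERS - 1}, got {sid}")
    return {"sid": sid, "extra": json.dumps({"lang": lang})}


args = make_generation_args("sl", 0)
print(args["extra"])
```

In the Python API the language is set with `gen_config.extra["lang"] = "sl"` instead, so only the validation part of this sketch applies there.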
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sl"
audio = tts.generate("To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
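If soundfile is not available, the standard-library `wave` module can write the result as 16-bit mono PCM. This is a minimal sketch that assumes the samples are floats normalized to [-1, 1]; the helper `write_wave` is hypothetical, and a generated sine tone stands in for `audio.samples`.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave(filename, samples, sample_rate):
    """Write float samples in [-1, 1] as a 16-bit mono PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in for audio.samples: 0.1 seconds of a 440 Hz tone at 24000 Hz.
sample_rate = 24000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2400)]
path = os.path.join(tempfile.gettempdir(), "sketch-test.wav")
write_wave(path, tone, sample_rate)
print("Saved to", path)
```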
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"sl\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "sl"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "sl"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'sl'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'sl'},
);
final audio = tts.generateWithConfig(text: 'To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "sl"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sl\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
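The C#, Java, Pascal, and Go examples for this model pass the language hint as a JSON string in the extra field, with the quotes escaped by hand. As a minimal sketch in Python, the same string can be produced from a dictionary instead. The helper name `build_extra` is hypothetical and not part of sherpa-onnx; it assumes only the `{"lang": "sl"}` shape shown in the examples.

```python
import json


def build_extra(lang: str) -> str:
    """Serialize the extra generation options to a JSON object string.

    Hypothetical helper: only the "lang" key seen in the examples is used.
    """
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    # Prints the JSON text that the examples write as an escaped literal.
    print(build_extra("sl"))
```

Building the string with a JSON serializer avoids quoting mistakes when the language code comes from user input.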
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "sl"),
)
val audio = tts.generateWithConfigAndCallback(
text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
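The callback in the Kotlin example returns 1 to continue and 0 to stop. The following Python sketch illustrates that convention with a callable that requests a stop once a sample budget is reached. The class name `StopAfter` is hypothetical, and the single-argument signature is an assumption modeled on the Kotlin callback; other bindings may pass extra arguments such as progress.

```python
class StopAfter:
    """Callback that asks generation to stop once enough samples arrived.

    Hypothetical helper: mirrors the 1 = continue / 0 = stop convention.
    """

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.received = 0

    def __call__(self, samples) -> int:
        # Count the samples delivered in this chunk.
        self.received += len(samples)
        # 1 means continue, 0 means stop.
        return 1 if self.received < self.max_samples else 0


if __name__ == "__main__":
    cb = StopAfter(max_samples=5)
    print(cb([0.0, 0.0, 0.0]))  # 3 samples received so far
    print(cb([0.0, 0.0, 0.0]))  # 6 samples received so far
```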
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"sl\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
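Every supertonic-3 example in this section points at the same seven files inside the model directory. A small pre-flight check, sketched here in Python, can report which of them are missing before a TTS object is created. The function name `missing_files` is hypothetical; the file names are copied from the examples.

```python
from pathlib import Path

# File names as they appear in the examples for this model.
SUPERTONIC_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_files(model_dir, names=SUPERTONIC_FILES):
    """Return the names that are not regular files under model_dir."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```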
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "sl"}';
Audio := Tts.GenerateWithConfig('To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "sl"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for each speaker is listed below:
Speaker 0
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 1
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 2
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 3
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 4
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 5
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 6
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 7
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 8
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
Speaker 9
0
Pozdravljen svet.
1
Kako si danes?
2
Nebo je modro in veter je nežen.
3
Strojno učenje pomaga računalnikom učiti se iz podatkov.
4
Sinteza govora pretvori besedilo v jasen zvok.
5
Učenci so v knjižnici prebrali kratko zgodbo.
6
Vlak je zamujal zaradi vzdrževanja tirov.
7
Majhni modeli hitro delujejo na lokalnih napravah.
8
Glasovni pomočnik pomaga pri vsakodnevnih opravilih.
9
Stabilno branje je pomembno za kratke in dolge stavke.
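The sample grid above pairs each speaker ID with each test sentence. As a rough sketch, a similar grid could be planned in Python before calling any of the generation APIs. The function `plan_jobs` and the output file naming scheme are hypothetical; only two of the sentences are repeated here, and the synthesis call itself is left out.

```python
from itertools import product

# First two sentences from the sample list above.
SENTENCES = [
    "Pozdravljen svet.",
    "Kako si danes?",
]


def plan_jobs(num_speakers, sentences, prefix="sl"):
    """List one job per (speaker ID, sentence) pair, speakers varying slowest."""
    jobs = []
    for sid, (index, text) in product(range(num_speakers), enumerate(sentences)):
        jobs.append({
            "sid": sid,
            "index": index,
            "text": text,
            # Hypothetical naming scheme, for illustration only.
            "filename": f"{prefix}-sid{sid}-{index}.wav",
        })
    return jobs


if __name__ == "__main__":
    for job in plan_jobs(num_speakers=10, sentences=SENTENCES):
        print(job["filename"], job["text"])
```

Each job carries the `sid` and text that would be passed to the generation config and the generate call.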
Spanish
This section lists text to speech models for Spanish.
vits-piper-es_AR-daniela-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_AR/daniela/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_AR-daniela-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
config.model.vits.tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
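If the compiled program is launched from a script rather than an interactive shell, the same library search path can be set for the child process only. The Python sketch below builds such an environment; the helper name `with_library_path` is hypothetical, and it assumes that `platform.system()` reporting `Darwin` is enough to select the macOS variable.

```python
import os
import platform


def with_library_path(lib_dir, env=None, system=None):
    """Return a copy of env with lib_dir prepended to the loader search path."""
    env = dict(os.environ if env is None else env)
    system = platform.system() if system is None else system
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    current = env.get(var, "")
    env[var] = f"{lib_dir}:{current}" if current else lib_dir
    return env


if __name__ == "__main__":
    env = with_library_path("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(name, "=", env[name])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run(["/tmp/test-piper"], env=env, check=True)`.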
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_AR-daniela-high
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
data_dir="vits-piper-es_AR-daniela-high/espeak-ng-data",
tokens="vits-piper-es_AR-daniela-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
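The example above relies on the third-party `soundfile` package to save the audio. If that dependency is unwanted, a 16-bit mono WAV file can also be written with the Python standard library, as in the sketch below. It assumes the generated samples are floats nominally in the range [-1, 1]; the helper names `float_to_int16` and `write_wav_int16` are hypothetical.

```python
import array
import sys
import wave


def float_to_int16(samples):
    """Clip floats to [-1, 1] and scale them to signed 16-bit integers."""
    pcm = array.array("h")
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_wav_int16(path, samples, sample_rate):
    """Write mono 16-bit PCM audio to path using the stdlib wave module."""
    pcm = float_to_int16(samples)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores samples in little-endian order
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


if __name__ == "__main__":
    # One second of silence at this model's sample rate.
    write_wav_int16("silence.wav", [0.0] * 22050, 22050)
    print("Saved to silence.wav")
```

With a generated `audio` object, the call would be `write_wav_int16("test.wav", audio.samples, audio.sample_rate)`.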
C++ API
You can use the following code to play with vits-piper-es_AR-daniela-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
config.model.vits.tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_AR-daniela-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx".into()),
tokens: Some("vits-piper-es_AR-daniela-high/tokens.txt".into()),
data_dir: Some("vits-piper-es_AR-daniela-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_AR-daniela-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx',
tokens: 'vits-piper-es_AR-daniela-high/tokens.txt',
dataDir: 'vits-piper-es_AR-daniela-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
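The Node.js example reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. The same arithmetic is sketched below in Python. The function names are hypothetical, and the inputs correspond to the sample count and sample rate of the generated audio.

```python
def audio_duration(num_samples, sample_rate):
    """Duration in seconds of num_samples mono samples at sample_rate Hz."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Processing time divided by audio duration."""
    duration = audio_duration(num_samples, sample_rate)
    if duration == 0:
        raise ValueError("no audio was generated")
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 44100 samples at 22050 Hz is 2 seconds of audio.
    rtf = real_time_factor(elapsed_seconds=1.0, num_samples=44100, sample_rate=22050)
    print(f"RTF = {rtf:.3f}")
```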
Dart API
You can use the following code to play with vits-piper-es_AR-daniela-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx',
tokens: 'vits-piper-es_AR-daniela-high/tokens.txt',
dataDir: 'vits-piper-es_AR-daniela-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_AR-daniela-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
lexicon: "",
tokens: "vits-piper-es_AR-daniela-high/tokens.txt",
dataDir: "vits-piper-es_AR-daniela-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_AR-daniela-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_AR-daniela-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_AR-daniela-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_AR-daniela-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
tokens = "vits-piper-es_AR-daniela-high/tokens.txt",
dataDir = "vits-piper-es_AR-daniela-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_AR-daniela-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx");
vits.setTokens("vits-piper-es_AR-daniela-high/tokens.txt");
vits.setDataDir("vits-piper-es_AR-daniela-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_AR-daniela-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_AR-daniela-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_AR-daniela-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_AR-daniela-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_AR-daniela-high/es_AR-daniela-high.onnx",
Tokens: "vits-piper-es_AR-daniela-high/tokens.txt",
DataDir: "vits-piper-es_AR-daniela-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-es_ES-carlfm-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/carlfm/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select
arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
config.model.vits.tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_ES-carlfm-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
            data_dir="vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
            tokens="vits-piper-es_ES-carlfm-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
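The Node.js example in this section reports a real-time factor (RTF), defined there as elapsed time divided by audio duration. The same measurement can be sketched in Python. The helpers `real_time_factor` and `timed` below are hypothetical names introduced for illustration and are not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration (below 1.0 is faster than real time)."""
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage with the objects from the example above (not executed here):
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
#   print(f"RTF = {rtf:.3f}")
```

For instance, 1 second spent generating 32000 samples at 16000 Hz (2 seconds of audio) gives an RTF of 0.5.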
C++ API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
config.model.vits.tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx".into()),
tokens: Some("vits-piper-es_ES-carlfm-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-es_ES-carlfm-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx',
tokens: 'vits-piper-es_ES-carlfm-x_low/tokens.txt',
dataDir: 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx',
tokens: 'vits-piper-es_ES-carlfm-x_low/tokens.txt',
dataDir: 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
lexicon: "",
tokens: "vits-piper-es_ES-carlfm-x_low/tokens.txt",
dataDir: "vits-piper-es_ES-carlfm-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
tokens = "vits-piper-es_ES-carlfm-x_low/tokens.txt",
dataDir = "vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx");
vits.setTokens("vits-piper-es_ES-carlfm-x_low/tokens.txt");
vits.setDataDir("vits-piper-es_ES-carlfm-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_ES-carlfm-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_ES-carlfm-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_ES-carlfm-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_ES-carlfm-x_low/es_ES-carlfm-x_low.onnx",
Tokens: "vits-piper-es_ES-carlfm-x_low/tokens.txt",
DataDir: "vits-piper-es_ES-carlfm-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-es_ES-davefx-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/davefx/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
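Once one of the examples below has produced `test.wav`, you can confirm that the file matches the sample rate in the table. The following is a minimal sketch using only Python's standard `wave` module. It assumes the file is PCM-encoded, since `wave` does not read other WAV encodings, and the helper name `wav_info` is hypothetical.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


if __name__ == "__main__":
    # test.wav is the output name used by the examples in this section.
    if os.path.exists("test.wav"):
        rate, channels, seconds = wav_info("test.wav")
        print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
    else:
        print("test.wav not found; run one of the examples first")
```

For this model the reported sample rate should be 22050.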
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_ES-davefx-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_ES-davefx-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
            data_dir="vits-piper-es_ES-davefx-medium/espeak-ng-data",
            tokens="vits-piper-es_ES-davefx-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
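If `soundfile` is not available, the generated samples can also be written with the standard library. The sketch below is only an illustration: it assumes `audio.samples` is a sequence of mono floats in the range [-1.0, 1.0], and the helper name `write_wav_pcm16` is hypothetical.

```python
import array
import sys
import wave


def write_wav_pcm16(path, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM."""
    # Clip each sample, then scale it to the signed 16-bit range.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Usage with the objects from the example above (not executed here):
#   write_wav_pcm16("test.wav", audio.samples, audio.sample_rate)
```

Values outside the assumed range are clipped rather than wrapped, which avoids loud artifacts if the assumption does not hold exactly.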
C++ API
You can use the following code to play with vits-piper-es_ES-davefx-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx".into()),
tokens: Some("vits-piper-es_ES-davefx-medium/tokens.txt".into()),
data_dir: Some("vits-piper-es_ES-davefx-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_ES-davefx-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx',
tokens: 'vits-piper-es_ES-davefx-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-davefx-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx',
tokens: 'vits-piper-es_ES-davefx-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-davefx-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
lexicon: "",
tokens: "vits-piper-es_ES-davefx-medium/tokens.txt",
dataDir: "vits-piper-es_ES-davefx-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_ES-davefx-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-davefx-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-davefx-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
tokens = "vits-piper-es_ES-davefx-medium/tokens.txt",
dataDir = "vits-piper-es_ES-davefx-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
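The callback in the example above returns 1 to continue and 0 to stop. The following Python sketch illustrates only that contract. Note that fake_generate is a hypothetical stand-in for the engine, not a sherpa-onnx function, and make_length_limited_callback is an illustrative helper name.

```python
from typing import Callable, Iterable, List


def make_length_limited_callback(max_samples: int) -> Callable[[List[float]], int]:
    """Return a callback that asks the generator to stop once max_samples arrived."""
    received = 0

    def callback(samples: List[float]) -> int:
        nonlocal received
        received += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if received < max_samples else 0

    return callback


def fake_generate(chunks: Iterable[List[float]],
                  callback: Callable[[List[float]], int]) -> List[float]:
    """Hypothetical stand-in for a TTS engine: hands each chunk to the callback."""
    out: List[float] = []
    for chunk in chunks:
        out.extend(chunk)
        if callback(chunk) == 0:
            break  # the callback asked us to stop generating
    return out


if __name__ == "__main__":
    chunks = [[0.0] * 100 for _ in range(10)]
    audio = fake_generate(chunks, make_length_limited_callback(250))
    print(len(audio))  # 300: the third chunk crosses the limit, then generation stops
```

In a real application the callback would typically forward the samples to an audio player and return 0 when the user cancels playback.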
Java API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx");
vits.setTokens("vits-piper-es_ES-davefx-medium/tokens.txt");
vits.setDataDir("vits-piper-es_ES-davefx-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_ES-davefx-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_ES-davefx-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
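The example above ends by writing float samples to a WAV file. As a rough sketch of what such a helper does, the following Python code writes mono samples as 16-bit PCM using only the standard library. It assumes the samples are floats in [-1, 1]; the function name write_wave is illustrative and the library's own WriteWave may behave differently.

```python
import os
import struct
import tempfile
import wave
from typing import Sequence


def write_wave(filename: str, samples: Sequence[float], sample_rate: int) -> int:
    """Write mono float samples as 16-bit PCM and return the number of frames."""
    frames = bytearray()
    for s in samples:
        # Clip to [-1, 1] so that out-of-range values do not overflow int16
        clipped = max(-1.0, min(1.0, s))
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    return len(samples)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "test.wav")
    n = write_wave(path, [0.0, 0.5, -0.5, 1.0], 22050)
    print(f"Wrote {n} frames to {path}")
```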
Go API
You can use the following code to play with vits-piper-es_ES-davefx-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_ES-davefx-medium/es_ES-davefx-medium.onnx",
Tokens: "vits-piper-es_ES-davefx-medium/tokens.txt",
DataDir: "vits-piper-es_ES-davefx-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-es_ES-glados-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://github.com/rhasspy/piper/issues/187#issuecomment-1802216304
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
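If you want to confirm that a generated file matches the sample rate in the table, a small check like the following may help. It is only a sketch: it assumes the output is a plain PCM WAV file (which the Python standard library can read), and wav_info is an illustrative helper name.

```python
import os
import tempfile
import wave


def wav_info(path: str):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        return rate, f.getnchannels(), f.getnframes() / rate


if __name__ == "__main__":
    # Create one second of 16-bit mono silence at 22050 Hz, just for the demo
    path = os.path.join(tempfile.mkdtemp(), "silence.wav")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(22050)
        f.writeframes(b"\x00\x00" * 22050)

    rate, channels, duration = wav_info(path)
    print(rate, channels, duration)
```

For a file produced by one of the examples on this page, you would call wav_info("test.wav") and compare the first value with the table.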
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_ES-glados-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
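If you launch the compiled program from a script instead of a shell, you can set the same variable for the child process only. The following Python sketch shows one way to do that; env_with_library_path is an illustrative helper name, and the choice of variable simply mirrors the Linux/macOS split described above.

```python
import os
import sys
from typing import Dict, Optional


def env_with_library_path(lib_dir: str,
                          platform: str = sys.platform,
                          base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the loader path."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH
    var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(var, "")
    env[var] = lib_dir + (":" + old if old else "")
    return env


if __name__ == "__main__":
    env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
    for name in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        if name in env:
            print(name, "=", env[name])
```

You can then pass the result to subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env).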
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_ES-glados-medium with Python API.
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
            data_dir="vits-piper-es_ES-glados-medium/espeak-ng-data",
            tokens="vits-piper-es_ES-glados-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-es_ES-glados-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_ES-glados-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx".into()),
tokens: Some("vits-piper-es_ES-glados-medium/tokens.txt".into()),
data_dir: Some("vits-piper-es_ES-glados-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_ES-glados-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx',
tokens: 'vits-piper-es_ES-glados-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-glados-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
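The Node.js example above reports the real-time factor (RTF), which is the elapsed processing time divided by the duration of the generated audio. An RTF below 1 means that generation is faster than playback. The same arithmetic, as a small Python sketch (real_time_factor is an illustrative helper name):

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 44100 samples at 22050 Hz is 2 seconds of audio; generating it in
    # 0.5 seconds gives an RTF of 0.25
    print(real_time_factor(0.5, 44100, 22050))
```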
Dart API
You can use the following code to play with vits-piper-es_ES-glados-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx',
tokens: 'vits-piper-es_ES-glados-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-glados-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_ES-glados-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
lexicon: "",
tokens: "vits-piper-es_ES-glados-medium/tokens.txt",
dataDir: "vits-piper-es_ES-glados-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_ES-glados-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-glados-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-glados-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_ES-glados-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
tokens = "vits-piper-es_ES-glados-medium/tokens.txt",
dataDir = "vits-piper-es_ES-glados-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_ES-glados-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx");
vits.setTokens("vits-piper-es_ES-glados-medium/tokens.txt");
vits.setDataDir("vits-piper-es_ES-glados-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_ES-glados-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_ES-glados-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_ES-glados-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_ES-glados-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_ES-glados-medium/es_ES-glados-medium.onnx",
Tokens: "vits-piper-es_ES-glados-medium/tokens.txt",
DataDir: "vits-piper-es_ES-glados-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audio for each speaker is listed below:
Speaker 0
vits-piper-es_ES-miro-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
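After you download and extract the model, it can be useful to check that the three paths used by the code examples on this page exist before you create the TTS object. The following Python sketch does that with the standard library only. The helper name missing_model_files is illustrative; the file names are taken from the examples for this model.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the names of expected entries that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / onnx_name).is_file():
        missing.append(onnx_name)
    if not (root / "tokens.txt").is_file():
        missing.append("tokens.txt")
    if not (root / "espeak-ng-data").is_dir():
        missing.append("espeak-ng-data")
    return missing


if __name__ == "__main__":
    # Demo with a temporary directory that lacks the .onnx file
    with tempfile.TemporaryDirectory() as d:
        (Path(d) / "tokens.txt").write_text("")
        (Path(d) / "espeak-ng-data").mkdir()
        print(missing_model_files(d, "es_ES-miro-high.onnx"))
```

For the real model you would call missing_model_files("vits-piper-es_ES-miro-high", "es_ES-miro-high.onnx") and expect an empty list.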
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_ES-miro-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
config.model.vits.tokens = "vits-piper-es_ES-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_ES-miro-high with Python API.
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
            data_dir="vits-piper-es_ES-miro-high/espeak-ng-data",
            tokens="vits-piper-es_ES-miro-high/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
                     sid=0,
                     speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-es_ES-miro-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
config.model.vits.tokens = "vits-piper-es_ES-miro-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-miro-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_ES-miro-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_ES-miro-high/es_ES-miro-high.onnx".into()),
tokens: Some("vits-piper-es_ES-miro-high/tokens.txt".into()),
data_dir: Some("vits-piper-es_ES-miro-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_ES-miro-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx',
tokens: 'vits-piper-es_ES-miro-high/tokens.txt',
dataDir: 'vits-piper-es_ES-miro-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
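The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio. The following is a minimal sketch of the same arithmetic in Python. The function name `real_time_factor` and the numbers in the usage part are illustrative assumptions and are not part of sherpa-onnx.

```python
def real_time_factor(num_samples: int, sample_rate: int, elapsed_seconds: float) -> float:
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical numbers: 5 seconds of audio at 22050 Hz generated in 0.5 seconds
    rtf = real_time_factor(num_samples=110250, sample_rate=22050, elapsed_seconds=0.5)
    print(f"RTF = {rtf:.3f}")
```

An RTF below 1 means the audio is generated faster than it takes to play it back.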
Dart API
You can use the following code to play with vits-piper-es_ES-miro-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx',
tokens: 'vits-piper-es_ES-miro-high/tokens.txt',
dataDir: 'vits-piper-es_ES-miro-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_ES-miro-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
lexicon: "",
tokens: "vits-piper-es_ES-miro-high/tokens.txt",
dataDir: "vits-piper-es_ES-miro-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_ES-miro-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-miro-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-miro-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_ES-miro-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
tokens = "vits-piper-es_ES-miro-high/tokens.txt",
dataDir = "vits-piper-es_ES-miro-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
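Several examples on this page pass a callback that returns 1 to continue generating and 0 to stop. The following Python sketch illustrates that contract with a callable that collects chunks and requests a stop once a time limit is reached. The class name `ChunkCollector` and the two-argument signature `(samples, progress)` are assumptions modeled on the C++ `ProgressCallback` shown on this page. The loop in the usage part only simulates a generator and does not call sherpa-onnx.

```python
from typing import List, Sequence


class ChunkCollector:
    """Collects generated chunks and asks the generator to stop after a limit."""

    def __init__(self, sample_rate: int, max_seconds: float) -> None:
        self.sample_rate = sample_rate
        self.max_samples = int(max_seconds * sample_rate)
        self.samples: List[float] = []

    def __call__(self, samples: Sequence[float], progress: float) -> int:
        self.samples.extend(samples)
        # 1 means to continue, 0 means to stop
        return 1 if len(self.samples) < self.max_samples else 0


if __name__ == "__main__":
    collector = ChunkCollector(sample_rate=22050, max_seconds=1.0)
    # Simulate a generator that would emit three chunks of 0.5 seconds each
    for i in range(3):
        chunk = [0.0] * 11025
        if collector(chunk, progress=(i + 1) / 3) == 0:
            break
    print("Collected samples:", len(collector.samples))
```

Keeping the state in an object rather than in globals makes it easy to use one collector per request.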
Java API
You can use the following code to play with vits-piper-es_ES-miro-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_ES-miro-high/es_ES-miro-high.onnx");
vits.setTokens("vits-piper-es_ES-miro-high/tokens.txt");
vits.setDataDir("vits-piper-es_ES-miro-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_ES-miro-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_ES-miro-high/es_ES-miro-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_ES-miro-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_ES-miro-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_ES-miro-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_ES-miro-high/es_ES-miro-high.onnx",
Tokens: "vits-piper-es_ES-miro-high/tokens.txt",
DataDir: "vits-piper-es_ES-miro-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-es_ES-sharvard-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_ES/sharvard/medium
| Number of speakers | Sample rate |
|---|---|
| 2 | 22050 |
Download the model
Model download address
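The examples in this section expect three entries inside the extracted model directory: the .onnx model file, tokens.txt, and the espeak-ng-data directory. The following is a minimal Python sketch that checks for them before you create the TTS object. The helper name `missing_model_files` is hypothetical, and the expected layout is an assumption taken from the paths used in the examples on this page.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, model_name: str) -> List[str]:
    """Return the expected paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / f"{model_name}.onnx",  # acoustic model
        root / "tokens.txt",          # token table
        root / "espeak-ng-data",      # directory with espeak-ng data
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-es_ES-sharvard-medium", "es_ES-sharvard-medium"
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model files found")
```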
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx and soundfile (which the example below imports) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
data_dir="vits-piper-es_ES-sharvard-medium/espeak-ng-data",
tokens="vits-piper-es_ES-sharvard-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
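The Python example above relies on the third-party soundfile package to save the audio. If you prefer to avoid that dependency, the following is a minimal sketch that writes mono float samples as 16-bit PCM using only the Python standard library. The helper name `write_wav_pcm16` is hypothetical, and the sketch assumes the samples are floats in the range [-1, 1].

```python
import math
import os
import struct
import tempfile
import wave
from typing import Sequence


def write_wav_pcm16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        # Clip out-of-range values before scaling to the int16 range
        clipped = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Stand-in data: 0.1 seconds of a 440 Hz tone instead of real TTS output
    rate = 22050
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 10)]
    path = os.path.join(tempfile.mkdtemp(), "sketch.wav")
    write_wav_pcm16(path, tone, rate)
    print("Saved to", path)
```

With the example above, a call would look like write_wav_pcm16("test.wav", audio.samples, audio.sample_rate).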
C++ API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
config.model.vits.tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx".into()),
tokens: Some("vits-piper-es_ES-sharvard-medium/tokens.txt".into()),
data_dir: Some("vits-piper-es_ES-sharvard-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_ES-sharvard-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx',
tokens: 'vits-piper-es_ES-sharvard-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-sharvard-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx',
tokens: 'vits-piper-es_ES-sharvard-medium/tokens.txt',
dataDir: 'vits-piper-es_ES-sharvard-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
lexicon: "",
tokens: "vits-piper-es_ES-sharvard-medium/tokens.txt",
dataDir: "vits-piper-es_ES-sharvard-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
tokens = "vits-piper-es_ES-sharvard-medium/tokens.txt",
dataDir = "vits-piper-es_ES-sharvard-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx");
vits.setTokens("vits-piper-es_ES-sharvard-medium/tokens.txt");
vits.setDataDir("vits-piper-es_ES-sharvard-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_ES-sharvard-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_ES-sharvard-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_ES-sharvard-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_ES-sharvard-medium/es_ES-sharvard-medium.onnx",
Tokens: "vits-piper-es_ES-sharvard-medium/tokens.txt",
DataDir: "vits-piper-es_ES-sharvard-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
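This model has 2 speakers, while the examples in this section all use sid 0. The following is a minimal Python sketch of looping over every speaker ID. The helper name `generate_for_all_speakers`, the output filename pattern, and the stand-in `fake_generate` function are assumptions for illustration. With a real model, the `generate` argument could wrap the `tts.generate(text=..., sid=..., speed=1.0)` call from the Python example in this section.

```python
from typing import Callable, Dict, Sequence


def generate_for_all_speakers(
    generate: Callable[[str, int], Sequence[float]],
    num_speakers: int,
    text: str,
) -> Dict[str, Sequence[float]]:
    """Call generate(text, sid) for every speaker ID and key results by filename."""
    return {f"speaker-{sid}.wav": generate(text, sid) for sid in range(num_speakers)}


if __name__ == "__main__":
    # Stand-in synthesizer so that the sketch runs without a model
    def fake_generate(text: str, sid: int) -> Sequence[float]:
        return [float(sid)] * len(text)

    outputs = generate_for_all_speakers(fake_generate, num_speakers=2, text="hola")
    for name, samples in outputs.items():
        print(name, len(samples))
```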
vits-piper-es_MX-ald-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_MX/ald/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_MX-ald-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
config.model.vits.tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_MX-ald-medium with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
data_dir="vits-piper-es_MX-ald-medium/espeak-ng-data",
tokens="vits-piper-es_MX-ald-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
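If soundfile is not available, the samples can also be saved with Python's standard library. The sketch below is one possible approach, not part of sherpa-onnx: it assumes the generated samples are mono floats roughly in the range [-1, 1], clips anything outside that range, and writes 16-bit PCM. The helper name `write_wave_pcm16` is hypothetical.

```python
import struct
import wave


def write_wave_pcm16(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM WAV.

    Returns the number of samples written.
    """
    pcm = []
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        pcm.append(int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        # "<" forces little-endian, which is what WAV files use
        f.writeframes(struct.pack("<{}h".format(len(pcm)), *pcm))
    return len(pcm)


# Hypothetical usage with the audio object from the example above:
#   write_wave_pcm16("test.wav", audio.samples, audio.sample_rate)
```

Clipping before scaling avoids integer overflow if a sample falls slightly outside the assumed range.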
C++ API
You can use the following code to play with vits-piper-es_MX-ald-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
config.model.vits.tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_MX-ald-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx".into()),
tokens: Some("vits-piper-es_MX-ald-medium/tokens.txt".into()),
data_dir: Some("vits-piper-es_MX-ald-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_MX-ald-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx',
tokens: 'vits-piper-es_MX-ald-medium/tokens.txt',
dataDir: 'vits-piper-es_MX-ald-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-es_MX-ald-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx',
tokens: 'vits-piper-es_MX-ald-medium/tokens.txt',
dataDir: 'vits-piper-es_MX-ald-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_MX-ald-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
lexicon: "",
tokens: "vits-piper-es_MX-ald-medium/tokens.txt",
dataDir: "vits-piper-es_MX-ald-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_MX-ald-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-es_MX-ald-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_MX-ald-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_MX-ald-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
tokens = "vits-piper-es_MX-ald-medium/tokens.txt",
dataDir = "vits-piper-es_MX-ald-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_MX-ald-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx");
vits.setTokens("vits-piper-es_MX-ald-medium/tokens.txt");
vits.setDataDir("vits-piper-es_MX-ald-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_MX-ald-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_MX-ald-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_MX-ald-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_MX-ald-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_MX-ald-medium/es_MX-ald-medium.onnx",
Tokens: "vits-piper-es_MX-ald-medium/tokens.txt",
DataDir: "vits-piper-es_MX-ald-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
audio samples for each speaker are listed below:
Speaker 0
vits-piper-es_MX-claude-high
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/es/es_MX/claude/high
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
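The table above lists a single speaker, so the only speaker ID used in the examples below is 0. The following sketch shows one way to guard a user-supplied speaker ID before passing it to a generation call. It assumes speaker IDs are zero-based integers smaller than the number of speakers, which is consistent with the "Speaker 0" sample listed for this model; the helper name `validate_sid` is hypothetical and not part of sherpa-onnx.

```python
def validate_sid(sid, num_speakers):
    """Return sid unchanged if it is a valid zero-based speaker ID.

    Raises TypeError for non-integers and ValueError for out-of-range IDs.
    """
    if isinstance(sid, bool) or not isinstance(sid, int):
        raise TypeError("sid must be an integer")
    if not 0 <= sid < num_speakers:
        raise ValueError(
            "sid {} is out of range for a model with {} speaker(s)".format(
                sid, num_speakers
            )
        )
    return sid


# Hypothetical usage for this single-speaker model:
#   sid = validate_sid(0, num_speakers=1)
```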
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-es_MX-claude-high with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
config.model.vits.tokens = "vits-piper-es_MX-claude-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_MX-claude-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-es_MX-claude-high with Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
data_dir="vits-piper-es_MX-claude-high/espeak-ng-data",
tokens="vits-piper-es_MX-claude-high/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
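The Node.js example for this model reports a real-time factor (RTF), that is, elapsed time divided by the duration of the generated audio. A rough Python equivalent might look like the sketch below. The helper names `real_time_factor` and `timed` are hypothetical and not part of sherpa-onnx.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = elapsed time / audio duration; values below 1 are faster than real time."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn):
    """Call fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Hypothetical usage with the tts object from the example above:
#   audio, elapsed = timed(lambda: tts.generate(text="...", sid=0, speed=1.0))
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
#   print(f"RTF = {rtf:.3f}")
```

`time.perf_counter()` is used instead of wall-clock time because it is monotonic and better suited to measuring short intervals.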
C++ API
You can use the following code to play with vits-piper-es_MX-claude-high with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
config.model.vits.tokens = "vits-piper-es_MX-claude-high/tokens.txt";
config.model.vits.data_dir = "vits-piper-es_MX-claude-high/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-es_MX-claude-high with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-es_MX-claude-high/es_MX-claude-high.onnx".into()),
tokens: Some("vits-piper-es_MX-claude-high/tokens.txt".into()),
data_dir: Some("vits-piper-es_MX-claude-high/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-es_MX-claude-high with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx',
tokens: 'vits-piper-es_MX-claude-high/tokens.txt',
dataDir: 'vits-piper-es_MX-claude-high/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-es_MX-claude-high with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx',
tokens: 'vits-piper-es_MX-claude-high/tokens.txt',
dataDir: 'vits-piper-es_MX-claude-high/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-es_MX-claude-high with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
lexicon: "",
tokens: "vits-piper-es_MX-claude-high/tokens.txt",
dataDir: "vits-piper-es_MX-claude-high/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-es_MX-claude-high with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx";
config.Model.Vits.Tokens = "vits-piper-es_MX-claude-high/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-es_MX-claude-high/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-es_MX-claude-high with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
tokens = "vits-piper-es_MX-claude-high/tokens.txt",
dataDir = "vits-piper-es_MX-claude-high/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-es_MX-claude-high with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-es_MX-claude-high/es_MX-claude-high.onnx");
vits.setTokens("vits-piper-es_MX-claude-high/tokens.txt");
vits.setDataDir("vits-piper-es_MX-claude-high/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-es_MX-claude-high with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-es_MX-claude-high/es_MX-claude-high.onnx';
Config.Model.Vits.Tokens := 'vits-piper-es_MX-claude-high/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-es_MX-claude-high/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-es_MX-claude-high with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-es_MX-claude-high/es_MX-claude-high.onnx",
Tokens: "vits-piper-es_MX-claude-high/tokens.txt",
DataDir: "vits-piper-es_MX-claude-high/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.
sample audio for different speakers is listed below:
Speaker 0
supertonic-3-es
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Spanish (es).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
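The language is selected per request through the `lang` entry of the generation config's `extra` field, which several examples on this page pass as a JSON string. The following is a minimal sketch of a helper that checks a speaker ID and a language code against the values listed above and builds that string. `make_generation_args` is a hypothetical name used for illustration only; it is not part of sherpa-onnx.

```python
import json

# Language codes and speaker count copied from the model info above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10


def make_generation_args(sid: int, lang: str) -> dict:
    """Hypothetical helper: validate inputs, return sid and a JSON `extra` string."""
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    # json.dumps produces the same string form used by the C, C#, Java,
    # Pascal and Go examples on this page.
    return {"sid": sid, "extra": json.dumps({"lang": lang})}


print(make_generation_args(0, "es"))
```

Rejecting an unknown code or an out-of-range speaker ID before calling the engine is a design choice of this sketch; how the library itself reacts to such values is not stated here.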
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with Python API.
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "es"
audio = tts.generate("Este es un motor de texto a voz que utiliza kaldi de próxima generación.", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
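To judge generation speed, the Node.js example on this page reports a real-time factor (RTF), that is, elapsed time divided by audio duration. The following is a minimal Python sketch of the same arithmetic. `real_time_factor` is a hypothetical helper name, and the numbers in the example call are made up for illustration.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 48000 samples at 24000 Hz is 2 seconds of audio.
rtf = real_time_factor(0.5, 48000, 24000)
print(f"RTF = {rtf:.3f}")
```

With the example above, one would time the `tts.generate(...)` call (for instance with `time.perf_counter()`) and pass `len(audio.samples)` and `audio.sample_rate`. An RTF below 1 means the audio is produced faster than it plays back.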
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"es\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
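The callback contract used above (return 1 to continue, 0 to stop) can be illustrated without the library. The following is a minimal Python sketch: `make_limited_callback` and `simulate_generation` are hypothetical names, and the loop only stands in for the engine, which is assumed here to invoke the callback once per generated chunk.

```python
from typing import Callable, List, Sequence


def make_limited_callback(max_samples: int) -> Callable[[Sequence[float], float], int]:
    """Build a callback that requests a stop once max_samples have been received."""
    received = 0

    def callback(samples: Sequence[float], progress: float) -> int:
        nonlocal received
        received += len(samples)
        # 1 means continue generating, 0 means stop generating.
        return 1 if received < max_samples else 0

    return callback


def simulate_generation(chunks: List[Sequence[float]],
                        callback: Callable[[Sequence[float], float], int]) -> int:
    """Stand-in for the engine (an assumption): deliver chunks until told to stop."""
    delivered = 0
    for i, chunk in enumerate(chunks):
        delivered += 1
        if callback(chunk, (i + 1) / len(chunks)) == 0:
            break
    return delivered


chunks = [[0.0, 0.0]] * 4  # four chunks of two samples each
print("chunks delivered:", simulate_generation(chunks, make_limited_callback(5)))
```

Keeping the counter inside a closure lets each generation request have its own limit without global state.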
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "es"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "es"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Este es un motor de texto a voz que utiliza kaldi de próxima generación.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'es'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'es'},
);
final audio = tts.generateWithConfig(text: 'Este es un motor de texto a voz que utiliza kaldi de próxima generación.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "es"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"es\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "es"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Este es un motor de texto a voz que utiliza kaldi de próxima generación.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"es\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "es"}';
Audio := Tts.GenerateWithConfig('Este es un motor de texto a voz que utiliza kaldi de próxima generación.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Este es un motor de texto a voz que utiliza kaldi de próxima generación."
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "es"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audio for different speakers is listed below:
Speaker 0
| No. | Text |
|---|---|
| 0 | Hola mundo. |
| 1 | ¿Cómo estás hoy? |
| 2 | El cielo es azul. |
| 3 | Me encanta el aprendizaje automático. |
| 4 | Python es increíble. |
| 5 | Buenos días a todos. |
| 6 | La inteligencia artificial está creciendo. |
| 7 | La síntesis de voz es fascinante. |
| 8 | Las redes neuronales son poderosas. |
| 9 | El texto a voz convierte texto en audio. |
| 10 | El veloz marrón salta sobre el perro perezoso. |
| 11 | El aprendizaje automático permite a las computadoras aprender. |
| 12 | El procesamiento del lenguaje natural ayuda a las máquinas. |
| 13 | El aprendizaje profundo ha revolucionado la inteligencia artificial. |
| 14 | La tecnología de síntesis de voz ha avanzado significativamente. |
| 15 | La clonación de voz neuronal puede replicar estilos de habla. |
| 16 | La normalización de texto es importante para la pronunciación. |
| 17 | Los asistentes de voz nos ayudan a interactuar con la tecnología. |
| 18 | Los sistemas TTS modernos utilizan aprendizaje profundo. |
| 19 | La interacción humano computadora se ha vuelto más intuitiva. |
Speaker 1
| No. | Text |
|---|---|
| 0 | Hola mundo. |
| 1 | ¿Cómo estás hoy? |
| 2 | El cielo es azul. |
| 3 | Me encanta el aprendizaje automático. |
| 4 | Python es increíble. |
| 5 | Buenos días a todos. |
| 6 | La inteligencia artificial está creciendo. |
| 7 | La síntesis de voz es fascinante. |
| 8 | Las redes neuronales son poderosas. |
| 9 | El texto a voz convierte texto en audio. |
| 10 | El veloz marrón salta sobre el perro perezoso. |
| 11 | El aprendizaje automático permite a las computadoras aprender. |
| 12 | El procesamiento del lenguaje natural ayuda a las máquinas. |
| 13 | El aprendizaje profundo ha revolucionado la inteligencia artificial. |
| 14 | La tecnología de síntesis de voz ha avanzado significativamente. |
| 15 | La clonación de voz neuronal puede replicar estilos de habla. |
| 16 | La normalización de texto es importante para la pronunciación. |
| 17 | Los asistentes de voz nos ayudan a interactuar con la tecnología. |
| 18 | Los sistemas TTS modernos utilizan aprendizaje profundo. |
| 19 | La interacción humano computadora se ha vuelto más intuitiva. |
Speaker 2
| No. | Text |
|---|---|
| 0 | Hola mundo. |
| 1 | ¿Cómo estás hoy? |
| 2 | El cielo es azul. |
| 3 | Me encanta el aprendizaje automático. |
| 4 | Python es increíble. |
| 5 | Buenos días a todos. |
| 6 | La inteligencia artificial está creciendo. |
| 7 | La síntesis de voz es fascinante. |
| 8 | Las redes neuronales son poderosas. |
| 9 | El texto a voz convierte texto en audio. |
| 10 | El veloz marrón salta sobre el perro perezoso. |
| 11 | El aprendizaje automático permite a las computadoras aprender. |
| 12 | El procesamiento del lenguaje natural ayuda a las máquinas. |
| 13 | El aprendizaje profundo ha revolucionado la inteligencia artificial. |
| 14 | La tecnología de síntesis de voz ha avanzado significativamente. |
| 15 | La clonación de voz neuronal puede replicar estilos de habla. |
| 16 | La normalización de texto es importante para la pronunciación. |
| 17 | Los asistentes de voz nos ayudan a interactuar con la tecnología. |
| 18 | Los sistemas TTS modernos utilizan aprendizaje profundo. |
| 19 | La interacción humano computadora se ha vuelto más intuitiva. |
Speaker 3
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 4
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 5
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 6
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 7
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 8
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Speaker 9
0
Hola mundo.
1
¿Cómo estás hoy?
2
El cielo es azul.
3
Me encanta el aprendizaje automático.
4
Python es increíble.
5
Buenos días a todos.
6
La inteligencia artificial está creciendo.
7
La síntesis de voz es fascinante.
8
Las redes neuronales son poderosas.
9
El texto a voz convierte texto en audio.
10
El veloz marrón salta sobre el perro perezoso.
11
El aprendizaje automático permite a las computadoras aprender.
12
El procesamiento del lenguaje natural ayuda a las máquinas.
13
El aprendizaje profundo ha revolucionado la inteligencia artificial.
14
La tecnología de síntesis de voz ha avanzado significativamente.
15
La clonación de voz neuronal puede replicar estilos de habla.
16
La normalización de texto es importante para la pronunciación.
17
Los asistentes de voz nos ayudan a interactuar con la tecnología.
18
Los sistemas TTS modernos utilizan aprendizaje profundo.
19
La interacción humano computadora se ha vuelto más intuitiva.
Swahili
This section lists text to speech models for Swahili.
vits-piper-sw_CD-lanfrica-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sw/sw_CD/lanfrica/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
config.model.vits.tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Mtu mmoja hawezi kuiba mazingira.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
            data_dir="vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
            tokens="vits-piper-sw_CD-lanfrica-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Mtu mmoja hawezi kuiba mazingira.",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
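The example above assumes the model has been extracted into the current directory. A missing file is easier to diagnose if the three paths are checked before the config is built. The following is a minimal sketch using only the standard library; `missing_model_files` is a hypothetical helper name and is not part of sherpa-onnx.

```python
from pathlib import Path


def missing_model_files(model_dir, onnx_name):
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the VITS model file
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-sw_CD-lanfrica-medium", "sw_CD-lanfrica-medium.onnx"
    )
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

If the list is not empty, download and extract the model again before creating `OfflineTtsConfig`.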
C++ API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
config.model.vits.tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Mtu mmoja hawezi kuiba mazingira.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx".into()),
tokens: Some("vits-piper-sw_CD-lanfrica-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sw_CD-lanfrica-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Mtu mmoja hawezi kuiba mazingira.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx',
tokens: 'vits-piper-sw_CD-lanfrica-medium/tokens.txt',
dataDir: 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Mtu mmoja hawezi kuiba mazingira.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
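The real-time factor (RTF) printed by the example above is the time spent generating audio divided by the duration of the generated audio, so a value below 1 means generation is faster than playback. The same arithmetic is shown below as a small Python sketch; `real_time_factor` is a hypothetical helper name and is not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed processing time divided by the audio duration."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # 44100 samples at 22050 Hz is 2 seconds of audio
    print(real_time_factor(0.5, 44100, 22050))
```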
Dart API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx',
tokens: 'vits-piper-sw_CD-lanfrica-medium/tokens.txt',
dataDir: 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Mtu mmoja hawezi kuiba mazingira.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
lexicon: "",
tokens: "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
dataDir: "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Mtu mmoja hawezi kuiba mazingira."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Mtu mmoja hawezi kuiba mazingira.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
tokens = "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
dataDir = "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Mtu mmoja hawezi kuiba mazingira.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx");
vits.setTokens("vits-piper-sw_CD-lanfrica-medium/tokens.txt");
vits.setDataDir("vits-piper-sw_CD-lanfrica-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Mtu mmoja hawezi kuiba mazingira.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sw_CD-lanfrica-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sw_CD-lanfrica-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Mtu mmoja hawezi kuiba mazingira.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sw_CD-lanfrica-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sw_CD-lanfrica-medium/sw_CD-lanfrica-medium.onnx",
Tokens: "vits-piper-sw_CD-lanfrica-medium/tokens.txt",
DataDir: "vits-piper-sw_CD-lanfrica-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Mtu mmoja hawezi kuiba mazingira."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Mtu mmoja hawezi kuiba mazingira.
sample audios for different speakers are listed below:
Speaker 0
Swedish
This section lists text to speech models for Swedish.
vits-piper-sv_SE-alma-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/alma/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sv_SE-alma-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Liten skog, med många träd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sv_SE-alma-medium
import sherpa_onnx
import soundfile as sf

config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
            data_dir="vits-piper-sv_SE-alma-medium/espeak-ng-data",
            tokens="vits-piper-sv_SE-alma-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)

if not config.validate():
    raise ValueError("Please check your config")

tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
                     sid=0,
                     speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
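If soundfile is not installed, the generated samples can also be saved with Python's standard `wave` module. The following is a minimal sketch, assuming the samples are a sequence of floats in the range [-1, 1] and the audio is mono; `write_wave_int16` is a hypothetical helper name and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples (assumed in [-1, 1]) as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample = 16 bits
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
```

Under those assumptions, it could be called as `write_wave_int16("test.wav", audio.samples, audio.sample_rate)` in place of the `sf.write` call.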
C++ API
You can use the following code to play with vits-piper-sv_SE-alma-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Liten skog, med många träd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx".into()),
tokens: Some("vits-piper-sv_SE-alma-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sv_SE-alma-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Liten skog, med många träd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sv_SE-alma-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx',
tokens: 'vits-piper-sv_SE-alma-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-alma-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Liten skog, med många träd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
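The example above prints a real-time factor (RTF): the elapsed processing time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The following Python sketch repeats that arithmetic in isolation. The function name real_time_factor and the numbers used are assumptions for illustration only and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed processing time divided by audio duration.

    A value below 1.0 means audio is generated faster than it plays back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of generated audio
    return elapsed_seconds / duration


# Hypothetical numbers: 44100 samples at 22050 Hz (2 seconds of audio)
# generated in 0.5 seconds of wall-clock time.
rtf = real_time_factor(0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```

With these made-up numbers the audio lasts 2 seconds, so the RTF is 0.5 / 2 = 0.25.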
Dart API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx',
tokens: 'vits-piper-sv_SE-alma-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-alma-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
lexicon: "",
tokens: "vits-piper-sv_SE-alma-medium/tokens.txt",
dataDir: "vits-piper-sv_SE-alma-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Liten skog, med många träd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sv_SE-alma-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-alma-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-alma-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
tokens = "vits-piper-sv_SE-alma-medium/tokens.txt",
dataDir = "vits-piper-sv_SE-alma-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Liten skog, med många träd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx");
vits.setTokens("vits-piper-sv_SE-alma-medium/tokens.txt");
vits.setDataDir("vits-piper-sv_SE-alma-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sv_SE-alma-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sv_SE-alma-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sv_SE-alma-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sv_SE-alma-medium/sv_SE-alma-medium.onnx",
Tokens: "vits-piper-sv_SE-alma-medium/tokens.txt",
DataDir: "vits-piper-sv_SE-alma-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Liten skog, med många träd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Liten skog, med många träd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-sv_SE-lisa-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/lisa/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
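To check that a generated wave file matches the sample rate listed in the table, you can read its header with the Python standard library. The sketch below is only an illustration: describe_wave is a hypothetical helper, it assumes the file is PCM WAV (the only kind the wave module reads), and it inspects a silent one-second file that it creates itself so that it runs without the model.

```python
import os
import tempfile
import wave


def describe_wave(path: str) -> dict:
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(path, "rb") as f:
        num_frames = f.getnframes()
        sample_rate = f.getframerate()
        return {
            "channels": f.getnchannels(),
            "sample_rate": sample_rate,
            "duration_seconds": num_frames / sample_rate,
        }


# Create one second of 16-bit mono silence at 22050 Hz as a stand-in
# for a generated file, then inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    w.setframerate(22050)
    w.writeframes(b"\x00\x00" * 22050)

info = describe_wave(demo_path)
print(info)
```

To inspect a real result, pass the path of the generated file, for example describe_wave("test.wav"), and compare the reported sample rate with the table.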
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Liten skog, med många träd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sv_SE-lisa-medium with the Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
data_dir="vits-piper-sv_SE-lisa-medium/espeak-ng-data",
tokens="vits-piper-sv_SE-lisa-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
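The example above expects three paths inside the model directory: the .onnx file, tokens.txt, and the espeak-ng-data directory. If any of them is missing, it can help to find out before creating the TTS object. The following is a minimal sketch using only the Python standard library. The helper missing_model_files is hypothetical and is not part of sherpa-onnx.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, onnx_name: str) -> List[str]:
    """Return the expected model paths that do not exist under model_dir."""
    root = Path(model_dir)
    expected = [
        root / onnx_name,         # the acoustic model
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


missing = missing_model_files(
    "vits-piper-sv_SE-lisa-medium", "sv_SE-lisa-medium.onnx"
)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All model files found")
```

Run it from the directory that contains the extracted model directory, the same place from which you would run the example above.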
C++ API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Liten skog, med många träd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx".into()),
tokens: Some("vits-piper-sv_SE-lisa-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sv_SE-lisa-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Liten skog, med många träd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sv_SE-lisa-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx',
tokens: 'vits-piper-sv_SE-lisa-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-lisa-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Liten skog, med många träd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx',
tokens: 'vits-piper-sv_SE-lisa-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-lisa-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
lexicon: "",
tokens: "vits-piper-sv_SE-lisa-medium/tokens.txt",
dataDir: "vits-piper-sv_SE-lisa-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Liten skog, med många träd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
tokens = "vits-piper-sv_SE-lisa-medium/tokens.txt",
dataDir = "vits-piper-sv_SE-lisa-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Liten skog, med många träd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
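The callback in the example above always returns 1, so generation runs to the end. Returning 0 asks the engine to stop. The following Python sketch illustrates that return-value convention without calling sherpa-onnx at all. Both make_budget_callback and simulate are hypothetical helpers: simulate is a stand-in for an engine that delivers audio in chunks and honours the return value.

```python
from typing import Callable, Iterable, Sequence


def make_budget_callback(max_samples: int) -> Callable[[Sequence[float]], int]:
    """Return a callback that requests a stop once max_samples have been seen."""
    seen = 0

    def callback(samples: Sequence[float]) -> int:
        nonlocal seen
        seen += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if seen < max_samples else 0

    return callback


def simulate(chunks: Iterable[Sequence[float]],
             callback: Callable[[Sequence[float]], int]) -> int:
    """Deliver chunks one by one until the callback returns 0.

    Returns the number of chunks that were delivered.
    """
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


# Five chunks of 1000 samples each, with a budget of 2500 samples.
chunks = [[0.0] * 1000 for _ in range(5)]
print(simulate(chunks, make_budget_callback(2500)))  # prints 3
```

The third chunk pushes the running total to 3000 samples, which exceeds the budget, so the callback returns 0 and the remaining two chunks are never delivered.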
Java API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx");
vits.setTokens("vits-piper-sv_SE-lisa-medium/tokens.txt");
vits.setDataDir("vits-piper-sv_SE-lisa-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sv_SE-lisa-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sv_SE-lisa-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sv_SE-lisa-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sv_SE-lisa-medium/sv_SE-lisa-medium.onnx",
Tokens: "vits-piper-sv_SE-lisa-medium/tokens.txt",
DataDir: "vits-piper-sv_SE-lisa-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Liten skog, med många träd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Liten skog, med många träd
sample audios for different speakers are listed below:
Speaker 0
vits-piper-sv_SE-nst-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/sv/sv_SE/nst/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-sv_SE-nst-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Liten skog, med många träd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
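If you launch the compiled binary from a script instead of an interactive shell, the same library search path can be set for the child process only. The following Python sketch builds such an environment. The helper library_env is hypothetical, and the sketch assumes the variable is LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on macOS, as in the commands above.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def library_env(lib_dir: str,
                platform: Optional[str] = None,
                base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path used on the given platform."""
    platform = sys.platform if platform is None else platform
    env = dict(os.environ if base_env is None else base_env)
    name = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    old = env.get(name, "")
    # Entries are colon-separated; skip the separator when nothing was set.
    env[name] = lib_dir + (":" + old if old else "")
    return env


env = library_env("/tmp/sherpa-onnx/shared/lib", platform="linux", base_env={})
print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can then be passed to the standard library, for example subprocess.run(["/tmp/test-piper"], env=env, cwd="/tmp").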
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-sv_SE-nst-medium with the Python API.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
data_dir="vits-piper-sv_SE-nst-medium/espeak-ng-data",
tokens="vits-piper-sv_SE-nst-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Liten skog, med många träd",
sid=0,
speed=1.0)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-sv_SE-nst-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
config.model.vits.tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Liten skog, med många träd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
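When the compiled binary is launched from a script rather than an interactive shell, the same library search path has to be set for the child process. The following is a minimal sketch of that idea. The helper names library_path_variable and env_with_library_path are hypothetical, and only Linux and macOS are handled because those are the two platforms named above.

```python
import os
import platform


def library_path_variable(system=None):
    """Return the dynamic-library search variable for the platform."""
    system = system or platform.system()
    if system == "Linux":
        return "LD_LIBRARY_PATH"
    if system == "Darwin":  # macOS
        return "DYLD_LIBRARY_PATH"
    raise ValueError(f"Unsupported system: {system}")


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended
    to the platform's library search path."""
    env = dict(os.environ if base_env is None else base_env)
    name = library_path_variable(system)
    old = env.get(name, "")
    env[name] = f"{lib_dir}:{old}" if old else lib_dir
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, for example subprocess.run(["/tmp/test-piper"], env=env_with_library_path("/tmp/sherpa-onnx/shared/lib")).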
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx".into()),
tokens: Some("vits-piper-sv_SE-nst-medium/tokens.txt".into()),
data_dir: Some("vits-piper-sv_SE-nst-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Liten skog, med många träd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-sv_SE-nst-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx',
tokens: 'vits-piper-sv_SE-nst-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-nst-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Liten skog, med många träd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
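The Node.js code above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. An RTF below 1 means synthesis runs faster than playback. The same arithmetic is sketched below in Python; the function name real_time_factor is hypothetical.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration.

    The audio duration in seconds is num_samples / sample_rate.
    """
    duration = num_samples / sample_rate
    if duration <= 0:
        raise ValueError("audio is empty")
    return elapsed_seconds / duration
```

For instance, generating 44100 samples at 22050 Hz (2 seconds of audio) in 0.5 seconds gives an RTF of 0.25.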
Dart API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx',
tokens: 'vits-piper-sv_SE-nst-medium/tokens.txt',
dataDir: 'vits-piper-sv_SE-nst-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Liten skog, med många träd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
lexicon: "",
tokens: "vits-piper-sv_SE-nst-medium/tokens.txt",
dataDir: "vits-piper-sv_SE-nst-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Liten skog, med många träd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-sv_SE-nst-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-sv_SE-nst-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-sv_SE-nst-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
tokens = "vits-piper-sv_SE-nst-medium/tokens.txt",
dataDir = "vits-piper-sv_SE-nst-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Liten skog, med många träd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx");
vits.setTokens("vits-piper-sv_SE-nst-medium/tokens.txt");
vits.setDataDir("vits-piper-sv_SE-nst-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Liten skog, med många träd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-sv_SE-nst-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-sv_SE-nst-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Liten skog, med många träd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-sv_SE-nst-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-sv_SE-nst-medium/sv_SE-nst-medium.onnx",
Tokens: "vits-piper-sv_SE-nst-medium/tokens.txt",
DataDir: "vits-piper-sv_SE-nst-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Liten skog, med många träd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Liten skog, med många träd
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-sv
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Swedish (sv).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
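Since the model accepts a fixed set of language codes and speaker IDs 0 to 9, it can help to validate a request before running synthesis. The sketch below encodes the 31 language codes and the speaker count listed above. The names SUPPORTED_LANGS, NUM_SPEAKERS, and check_request are hypothetical and not part of sherpa-onnx.

```python
# Language codes and speaker count copied from the model description above.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu "
    "id it lt lv nl pl pt ro ru sk sl sv tr uk vi"
).split()
NUM_SPEAKERS = 10


def check_request(lang, sid):
    """Validate a language code and speaker ID.

    Returns a dict with the validated values, or raises ValueError.
    """
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language: {lang}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return {"lang": lang, "sid": sid}
```

Calling check_request("sv", 0) passes for the Swedish samples on this page, while an unknown code or a speaker ID of 10 raises ValueError.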
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3 with the Python API:
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "sv"
audio = tts.generate("Detta är en text till tal-motor som använder nästa generations kaldi", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
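The configuration above references seven files inside the model directory. A missing file is easier to diagnose before the model is loaded, so the following sketch checks for them up front. The file names are taken from the example above; the names REQUIRED_FILES and missing_model_files are hypothetical.

```python
from pathlib import Path

# File names as used in the configuration above.
REQUIRED_FILES = (
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
)


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list from missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11") means every path used in the configuration resolves to a file.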
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Detta är en text till tal-motor som använder nästa generations kaldi";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"sv\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
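The C code above passes the language through gen_cfg.extra as a JSON string, with the quotes escaped by hand. When that string is produced by a script (for example, to generate test inputs for several languages), a JSON library avoids escaping mistakes. This is a minimal sketch; the helper name build_extra is hypothetical.

```python
import json


def build_extra(lang):
    """Return the JSON text for the `extra` field, e.g. {"lang": "sv"}."""
    return json.dumps({"lang": lang})
```

build_extra("sv") yields the same text that the C example writes as an escaped string literal.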
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Detta är en text till tal-motor som använder nästa generations kaldi";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "sv"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Detta är en text till tal-motor som använder nästa generations kaldi";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "sv"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Detta är en text till tal-motor som använder nästa generations kaldi';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'sv'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'sv'},
);
final audio = tts.generateWithConfig(text: 'Detta är en text till tal-motor som använder nästa generations kaldi', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Detta är en text till tal-motor som använder nästa generations kaldi"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "sv"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Detta är en text till tal-motor som använder nästa generations kaldi";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"sv\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "sv"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Detta är en text till tal-motor som använder nästa generations kaldi",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Detta är en text till tal-motor som använder nästa generations kaldi";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"sv\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "sv"}';
Audio := Tts.GenerateWithConfig('Detta är en text till tal-motor som använder nästa generations kaldi', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Detta är en text till tal-motor som använder nästa generations kaldi"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "sv"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
Sample audios for different speakers are listed below:
Speaker 0
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 1
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 2
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 3
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 4
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 5
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 6
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 7
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 8
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Speaker 9
0
Hej världen.
1
Hur mår du idag?
2
Himlen är blå och vinden är mild.
3
Maskininlärning hjälper datorer att lära sig av data.
4
Talsyntes omvandlar text till tydligt ljud.
5
Eleverna läste en kort berättelse på biblioteket.
6
Tåget blev försenat på grund av spårunderhåll.
7
Små modeller kör snabbt på lokala enheter.
8
En röstassistent hjälper till med vardagliga uppgifter.
9
Stabil uppläsning är viktig för korta och långa meningar.
Turkish
This section lists text to speech models for Turkish.
vits-piper-tr_TR-dfki-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/tr/tr_TR/dfki/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
config.model.vits.tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-tr_TR-dfki-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
data_dir="vits-piper-tr_TR-dfki-medium/espeak-ng-data",
tokens="vits-piper-tr_TR-dfki-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
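If soundfile is not installed, the generated samples can also be saved with Python's standard wave module. The following is a minimal sketch, assuming the samples are mono floats in the range [-1, 1] (an assumption about audio.samples, not something stated on this page). The helper name write_wave_16bit is hypothetical and is not part of sherpa-onnx.

```python
import struct
import wave


def write_wave_16bit(filename, samples, sample_rate):
    """Write mono float samples (assumed to lie in [-1, 1]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        # Clip to [-1, 1] so out-of-range values do not overflow int16.
        s = max(-1.0, min(1.0, float(s)))
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))
    # Number of samples written
    return len(samples)
```

With the example above, a call such as write_wave_16bit("test.wav", audio.samples, audio.sample_rate) could be used in place of sf.write.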
C++ API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
config.model.vits.tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx".into()),
tokens: Some("vits-piper-tr_TR-dfki-medium/tokens.txt".into()),
data_dir: Some("vits-piper-tr_TR-dfki-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-tr_TR-dfki-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx',
tokens: 'vits-piper-tr_TR-dfki-medium/tokens.txt',
dataDir: 'vits-piper-tr_TR-dfki-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
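The Node.js example above reports a real-time factor (RTF), computed as the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows; the function name real_time_factor is hypothetical and is not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (the RTF)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# 44100 samples at 22050 Hz is 2 seconds of audio; generated in 0.5 seconds.
print(real_time_factor(0.5, 44100, 22050))  # prints 0.25
```

An RTF below 1 means the audio was generated faster than it takes to play back.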
Dart API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx',
tokens: 'vits-piper-tr_TR-dfki-medium/tokens.txt',
dataDir: 'vits-piper-tr_TR-dfki-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
lexicon: "",
tokens: "vits-piper-tr_TR-dfki-medium/tokens.txt",
dataDir: "vits-piper-tr_TR-dfki-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
tokens = "vits-piper-tr_TR-dfki-medium/tokens.txt",
dataDir = "vits-piper-tr_TR-dfki-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
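The callback in the example above returns 1 to continue generating and 0 to stop. The following Python sketch illustrates only that return-value convention with a callable that asks for a stop once a sample budget is reached. The class name SampleBudgetCallback and the optional progress argument are assumptions made for illustration; how a given language binding accepts such a callback is not shown here.

```python
class SampleBudgetCallback:
    """Collect generated chunks and request a stop after a sample budget."""

    def __init__(self, max_samples):
        self.max_samples = max_samples
        self.num_samples = 0
        self.chunks = []

    def __call__(self, samples, progress=0.0):
        # Keep a copy of the chunk, e.g. for streaming playback later.
        self.chunks.append(list(samples))
        self.num_samples += len(samples)
        # 1 means continue, 0 means stop (same convention as above).
        return 1 if self.num_samples < self.max_samples else 0
```

A callable like this keeps its state between invocations, which a plain function returning a constant 1 cannot do.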
Java API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx");
vits.setTokens("vits-piper-tr_TR-dfki-medium/tokens.txt");
vits.setDataDir("vits-piper-tr_TR-dfki-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-tr_TR-dfki-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-tr_TR-dfki-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-tr_TR-dfki-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-tr_TR-dfki-medium/tr_TR-dfki-medium.onnx",
Tokens: "vits-piper-tr_TR-dfki-medium/tokens.txt",
DataDir: "vits-piper-tr_TR-dfki-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz
sample audios for different speakers are listed below:
Speaker 0
supertonic-3-tr
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Turkish (tr).
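Several examples for this model select the language through an extra JSON string such as {"lang": "tr"}. The following is a minimal sketch that checks a language code against the 31 codes listed above and builds that string. The names SUPPORTED_LANGS and make_extra are hypothetical helpers, not part of sherpa-onnx.

```python
import json

# The 31 language codes listed above for supertonic 3.
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def make_extra(lang):
    """Return the JSON string for the extra field of a generation config."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return json.dumps({"lang": lang})


print(make_extra("tr"))  # prints {"lang": "tr"}
```

Validating the code up front gives a clear error for a typo such as "tk" instead of relying on how the engine handles an unknown language.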
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "tr"
audio = tts.generate("Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"tr\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "tr"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "tr"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
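The C++ and Rust examples above pass the language through `extra` as a JSON string. As a minimal sketch (plain Python standard library, not part of the sherpa-onnx API; the helper name `make_extra` is hypothetical), such a string can be built from a dictionary instead of being typed by hand:

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the language selection to the JSON text used for `extra`."""
    # The key name "lang" is taken from the examples above.
    return json.dumps({"lang": lang})


print(make_extra("tr"))  # prints {"lang": "tr"}
```

Building the string with a JSON serializer avoids quoting mistakes when the value comes from user input.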
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'tr'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
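The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. The same arithmetic is sketched below in Python; the function name and the numbers are hypothetical and only illustrate the formula.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # seconds of audio produced
    return elapsed_seconds / duration


# Hypothetical numbers: 0.5 s of compute for 32000 samples at 16 kHz (2 s of audio).
rtf = real_time_factor(0.5, 32000, 16000)
print(f"RTF = {rtf:.3f}")  # prints RTF = 0.250
```

An RTF below 1 means the audio is generated faster than it takes to play it back.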
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'tr'},
);
final audio = tts.generateWithConfig(text: "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur", config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "tr"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"tr\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "tr"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
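The callbacks in the C++ and Kotlin examples return 1 to continue and 0 to stop generating. The sketch below simulates that contract in plain Python with a stand-in generator; it does not call sherpa-onnx, every name in it is hypothetical, and keeping the chunk that triggered the stop is an assumption of this sketch rather than documented library behaviour.

```python
from typing import Callable, List


def generate_in_chunks(chunks: List[List[float]],
                       callback: Callable[[List[float]], int]) -> List[float]:
    """Stand-in for a TTS engine: hand each chunk to `callback`, stop on 0."""
    produced: List[float] = []
    for chunk in chunks:
        produced.extend(chunk)
        if callback(chunk) == 0:  # 0 means "stop generating"
            break
    return produced


def stop_after(limit: int) -> Callable[[List[float]], int]:
    """Build a callback that returns 0 once it has seen `limit` chunks."""
    seen = 0

    def callback(samples: List[float]) -> int:
        nonlocal seen
        seen += 1
        return 1 if seen < limit else 0

    return callback


fake_chunks = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
audio = generate_in_chunks(fake_chunks, stop_after(2))
print(len(audio))  # prints 4: the third chunk is never produced
```

A callback of this shape is a natural place to implement a "cancel" button or to start playback before synthesis has finished.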
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"tr\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "tr"}';
Audio := Tts.GenerateWithConfig('Bu, yeni nesil kaldi''yi kullanan bir metinden konuşmaya motorudur', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "tr"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
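Every example above points at the same seven files inside the extracted model directory. Before constructing the TTS object it can help to confirm that they are all present. The following is a minimal sketch using only the Python standard library; the file names are copied from the examples above, while the helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path
from typing import List

# File names taken from the examples above.
SUPERTONIC_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir: str) -> List[str]:
    """Return the expected files that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in SUPERTONIC_FILES if not (root / name).is_file()]


missing = missing_model_files("sherpa-onnx-supertonic-3-tts-int8-2026-05-11")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All model files found")
```

Running the check from the directory where the archive was extracted gives an early, readable error instead of a failure inside the runtime.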
Samples
Sample audio for different speakers is listed below:
Speaker 0
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 1
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 2
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 3
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 4
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 5
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 6
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 7
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 8
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Speaker 9
0
Merhaba dünya.
1
Bugün nasılsın?
2
Gökyüzü mavi ve rüzgar hafif.
3
Makine öğrenimi bilgisayarların verilerden öğrenmesine yardımcı olur.
4
Konuşma sentezi metni anlaşılır sese dönüştürür.
5
Öğrenciler kütüphanede kısa bir hikaye okudu.
6
Tren ray bakımı nedeniyle gecikti.
7
Küçük modeller yerel cihazlarda hızlı çalışır.
8
Sesli asistan günlük işlerde yardımcı olur.
9
Kararlı okuma kısa ve uzun cümleler için önemlidir.
Ukrainian
This section lists text to speech models for Ukrainian.
vits-piper-uk_UA-lada-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/uk/uk_UA/lada/x_low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
config.model.vits.tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with vits-piper-uk_UA-lada-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
            data_dir="vits-piper-uk_UA-lada-x_low/espeak-ng-data",
            tokens="vits-piper-uk_UA-lada-x_low/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Ви не можете навчити коня, якщо не відвикнете від годівлі.",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
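If `soundfile` is not available, the samples can also be saved with Python's built-in `wave` module. The sketch below is an illustration under stated assumptions: it assumes the samples are mono floats in the range [-1, 1] (the C++ example exposes them as `float`), the helper name `write_wave_int16` is hypothetical, and stand-in data is used in place of `audio.samples`.

```python
import struct
import wave
from typing import Sequence


def write_wave_int16(filename: str, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16 bit
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


# Stand-in data; with sherpa-onnx you would pass audio.samples and audio.sample_rate.
write_wave_int16("test-stdlib.wav", [0.0, 0.5, -0.5, 1.0], 16000)
```

The per-sample loop is slow for long utterances; it is kept here only to avoid extra dependencies.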
C++ API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
config.model.vits.tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx".into()),
tokens: Some("vits-piper-uk_UA-lada-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-uk_UA-lada-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-uk_UA-lada-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx',
tokens: 'vits-piper-uk_UA-lada-x_low/tokens.txt',
dataDir: 'vits-piper-uk_UA-lada-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ви не можете навчити коня, якщо не відвикнете від годівлі.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx',
tokens: 'vits-piper-uk_UA-lada-x_low/tokens.txt',
dataDir: 'vits-piper-uk_UA-lada-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ви не можете навчити коня, якщо не відвикнете від годівлі.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
lexicon: "",
tokens: "vits-piper-uk_UA-lada-x_low/tokens.txt",
dataDir: "vits-piper-uk_UA-lada-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
tokens = "vits-piper-uk_UA-lada-x_low/tokens.txt",
dataDir = "vits-piper-uk_UA-lada-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx");
vits.setTokens("vits-piper-uk_UA-lada-x_low/tokens.txt");
vits.setDataDir("vits-piper-uk_UA-lada-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-uk_UA-lada-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-uk_UA-lada-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ви не можете навчити коня, якщо не відвикнете від годівлі.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-uk_UA-lada-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-uk_UA-lada-x_low/uk_UA-lada-x_low.onnx",
Tokens: "vits-piper-uk_UA-lada-x_low/tokens.txt",
DataDir: "vits-piper-uk_UA-lada-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ви не можете навчити коня, якщо не відвикнете від годівлі."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ви не можете навчити коня, якщо не відвикнете від годівлі.
sample audios for different speakers are listed below:
Speaker 0
vits-piper-uk_UA-ukrainian_tts-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/uk/uk_UA/ukrainian_tts/medium
| Number of speakers | Sample rate |
|---|---|
| 3 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.c -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
data_dir="vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
tokens="vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ви не можете навчити коня, якщо не відвикнете від годівлі.",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
C++ API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -o /tmp/test-piper /tmp/test-piper.cc -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx".into()),
tokens: Some("vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt".into()),
data_dir: Some("vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx',
tokens: 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt',
dataDir: 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ви не можете навчити коня, якщо не відвикнете від годівлі.';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
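The Node.js example above reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic is sketched below in Python. The function name real_time_factor and the numbers in the usage note are illustrative assumptions, not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (hypothetical helper).

    A value below 1.0 means the audio was generated faster than it plays back.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration
```

For instance, 44100 samples at 22050 Hz last 2.0 seconds, so an elapsed time of 0.5 seconds gives real_time_factor(0.5, 44100, 22050) == 0.25.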
Dart API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx',
tokens: 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt',
dataDir: 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ви не можете навчити коня, якщо не відвикнете від годівлі.', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
lexicon: "",
tokens: "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
dataDir: "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ви не можете навчити коня, якщо не відвикнете від годівлі."
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
tokens = "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
dataDir = "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx");
vits.setTokens("vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt");
vits.setDataDir("vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ви не можете навчити коня, якщо не відвикнете від годівлі.";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ви не можете навчити коня, якщо не відвикнете від годівлі.', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-uk_UA-ukrainian_tts-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-uk_UA-ukrainian_tts-medium/uk_UA-ukrainian_tts-medium.onnx",
Tokens: "vits-piper-uk_UA-ukrainian_tts-medium/tokens.txt",
DataDir: "vits-piper-uk_UA-ukrainian_tts-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ви не можете навчити коня, якщо не відвикнете від годівлі."
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ви не можете навчити коня, якщо не відвикнете від годівлі.
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
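Since this model has 3 speakers, one way to produce a sample per speaker is to loop over the speaker IDs with the Python API calls used earlier in this section. The sketch below is a minimal illustration and makes a few assumptions: the model has been extracted into the current directory, the speaker count is hard-coded from the table above, and the helper names output_name and generate_all_speakers as well as the speaker-N.wav file names are hypothetical.

```python
from pathlib import Path

MODEL_DIR = "vits-piper-uk_UA-ukrainian_tts-medium"
NUM_SPEAKERS = 3  # taken from the "Number of speakers" table for this model
TEXT = "Ви не можете навчити коня, якщо не відвикнете від годівлі."


def output_name(sid: int, out_dir: str = ".") -> str:
    """Hypothetical naming scheme: one WAV file per speaker ID."""
    return str(Path(out_dir) / f"speaker-{sid}.wav")


def generate_all_speakers(out_dir: str = ".") -> list[str]:
    """Generate TEXT once per speaker ID and return the written file names."""
    # Imported here so that output_name() works without these packages.
    import sherpa_onnx
    import soundfile as sf

    config = sherpa_onnx.OfflineTtsConfig(
        model=sherpa_onnx.OfflineTtsModelConfig(
            vits=sherpa_onnx.OfflineTtsVitsModelConfig(
                model=f"{MODEL_DIR}/uk_UA-ukrainian_tts-medium.onnx",
                data_dir=f"{MODEL_DIR}/espeak-ng-data",
                tokens=f"{MODEL_DIR}/tokens.txt",
            ),
            num_threads=1,
        ),
    )
    if not config.validate():
        raise ValueError("Please check your config")
    tts = sherpa_onnx.OfflineTts(config)

    written = []
    for sid in range(NUM_SPEAKERS):
        # Same call as in the Python API example; only sid changes.
        audio = tts.generate(text=TEXT, sid=sid, speed=1.0)
        filename = output_name(sid, out_dir)
        sf.write(filename, audio.samples, samplerate=audio.sample_rate)
        written.append(filename)
    return written
```

Calling generate_all_speakers() with the model in place should write speaker-0.wav, speaker-1.wav, and speaker-2.wav, one per speaker ID.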
supertonic-3-uk
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Ukrainian (uk).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "uk"
audio = tts.generate("Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
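The example above selects the language with gen_config.extra["lang"], while the C, C++, and Rust examples on this page pass the same setting as a JSON string. A small sketch of how one might check a language code against the 31 codes listed for this model and build that JSON string is shown below. The helper names check_lang and extra_json are hypothetical and not part of sherpa-onnx; how the library itself reacts to an unsupported code is not covered here.

```python
import json

# The 31 language codes listed for supertonic 3 in "Info about this model".
SUPPORTED_LANGS = (
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi"
).split()


def check_lang(lang: str) -> str:
    """Return the normalized code, or raise ValueError if it is not listed."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language: {lang!r}")
    return code


def extra_json(lang: str) -> str:
    """Build the JSON string form of the language setting."""
    return json.dumps({"lang": check_lang(lang)})
```

With these helpers, gen_config.extra["lang"] = check_lang("uk") fails early on a typo, and extra_json("uk") yields the string {"lang": "uk"} that the C, C++, and Rust examples assign to extra.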
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"uk\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.c \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "uk"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
-I /tmp/sherpa-onnx/shared/include \
-o /tmp/test-supertonic \
/tmp/test-supertonic.cc \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "uk"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'uk'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
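The Node.js example above reports a real-time factor (RTF): the time spent synthesizing divided by the duration of the generated audio, so values below 1 mean synthesis runs faster than real time. A minimal sketch of the same arithmetic in Python follows. The function name and the numbers in the demo are made up for illustration and are not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Hypothetical measurement: 0.5 s of compute for 88200 samples at 44100 Hz.
    rtf = real_time_factor(0.5, 88200, 44100)
    print(f"RTF = {rtf:.3f}")
```

In practice you would pass the measured wall-clock time together with the number of generated samples and the sample rate reported by the generated audio.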
Dart API
You can use the following code to play with supertonic-3 with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'uk'},
);
final audio = tts.generateWithConfig(text: 'Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with the Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "uk"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"uk\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "uk"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with supertonic-3 with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"uk\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with the Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "uk"}';
Audio := Tts.GenerateWithConfig('Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with the Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Це механізм перетворення тексту на мовлення, який використовує kaldi нового покоління"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "uk"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
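Several of the examples above (C++, Rust, C#, Java, Pascal, Go) pass the language to the model as a JSON string in the extra field, while others (Node.js, Dart, Swift, Kotlin) pass a map. Writing the JSON string by hand means escaping quotes correctly, so it can be convenient to build it programmatically. The following is a minimal sketch using Python's standard json module. The helper name make_extra is hypothetical, and only the "lang" key that appears in the examples is assumed.

```python
import json


def make_extra(lang: str) -> str:
    """Build the JSON string used for the `extra` field, e.g. {"lang": "uk"}."""
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    extra = make_extra("uk")
    print(extra)
    # Parsing it back recovers the original mapping.
    print(json.loads(extra)["lang"])
```

The resulting string has the same shape as the literal used in the C++, Rust and Go examples above.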
Samples
Sample audio for different speakers is listed below:
Speaker 0
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 1
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 2
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 3
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 4
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 5
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 6
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 7
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 8
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Speaker 9
0
Привіт світе.
1
Як ти сьогодні?
2
Небо блакитне, а вітер лагідний.
3
Машинне навчання допомагає комп’ютерам вчитися на даних.
4
Синтез мовлення перетворює текст на зрозумілий звук.
5
Учні прочитали коротку історію в бібліотеці.
6
Потяг затримався через ремонт колії.
7
Невеликі моделі швидко працюють на локальних пристроях.
8
Голосовий помічник допомагає з щоденними завданнями.
9
Стабільне читання важливе для коротких і довгих речень.
Urdu
This section lists text-to-speech models for Urdu.
vits-piper-ur_PK-fasih-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/ur/ur_PK/fasih/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
config.model.vits.tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.c
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-ur_PK-fasih-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
data_dir="vits-piper-ur_PK-fasih-medium/espeak-ng-data",
tokens="vits-piper-ur_PK-fasih-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
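The Python example above relies on the third-party soundfile package to save the audio. If that package is not available, a minimal sketch using only the standard library could look like the following. It assumes the samples are mono floats in the range [-1, 1], which is what the other examples on this page pass to their wave-writing helpers. The function name write_wave_int16 and the test tone in the demo are hypothetical and not part of sherpa-onnx.

```python
import math
import os
import struct
import tempfile
import wave


def write_wave_int16(filename, samples, sample_rate):
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(frames))


if __name__ == "__main__":
    # Demo with a synthetic 0.1 s, 440 Hz tone instead of real TTS output.
    sample_rate = 22050
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
            for n in range(sample_rate // 10)]
    path = os.path.join(tempfile.gettempdir(), "tone.wav")
    write_wave_int16(path, tone, sample_rate)
    print("Saved to", path)
```

With the audio object from the example above, the call would be write_wave_int16("test.wav", audio.samples, audio.sample_rate).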
C++ API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
config.model.vits.tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper /tmp/test-piper.cc
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx".into()),
tokens: Some("vits-piper-ur_PK-fasih-medium/tokens.txt".into()),
data_dir: Some("vits-piper-ur_PK-fasih-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
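The C++ and Rust examples above pass a progress callback that receives the samples generated so far together with a progress value between 0 and 1, and that tells the engine whether to continue (1 or true) or stop (0 or false). The sketch below shows the same continue/stop logic in plain Python. It does not call sherpa-onnx. The factory name make_progress_callback and the stop_after threshold are hypothetical, and whether your binding accepts a callable of exactly this shape is an assumption you should check against its documentation.

```python
def make_progress_callback(stop_after: float = 1.1):
    """Return a callback that reports progress and asks to stop at a threshold.

    The default threshold is above 1.0, so generation is never interrupted.
    """
    def callback(samples, progress: float) -> int:
        print(f"Progress: {progress * 100:.1f}%")
        # 1 means continue generating, 0 means stop.
        return 1 if progress < stop_after else 0

    return callback


if __name__ == "__main__":
    # Simulate an engine invoking the callback after each generated chunk.
    cb = make_progress_callback(stop_after=0.5)
    for progress in (0.25, 0.5, 0.75, 1.0):
        if cb([], progress) == 0:
            print("Stopped early")
            break
```

Stopping early is useful, for example, when the user cancels playback before the whole text has been synthesized.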
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx',
tokens: 'vits-piper-ur_PK-fasih-medium/tokens.txt',
dataDir: 'vits-piper-ur_PK-fasih-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx',
tokens: 'vits-piper-ur_PK-fasih-medium/tokens.txt',
dataDir: 'vits-piper-ur_PK-fasih-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with the Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
lexicon: "",
tokens: "vits-piper-ur_PK-fasih-medium/tokens.txt",
dataDir: "vits-piper-ur_PK-fasih-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
tokens = "vits-piper-ur_PK-fasih-medium/tokens.txt",
dataDir = "vits-piper-ur_PK-fasih-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx");
vits.setTokens("vits-piper-ur_PK-fasih-medium/tokens.txt");
vits.setDataDir("vits-piper-ur_PK-fasih-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-ur_PK-fasih-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-ur_PK-fasih-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-ur_PK-fasih-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-ur_PK-fasih-medium/ur_PK-fasih-medium.onnx",
Tokens: "vits-piper-ur_PK-fasih-medium/tokens.txt",
DataDir: "vits-piper-ur_PK-fasih-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
قوس قزح، جسے قوس قزح یا رنگوں کی قوس قزح بھی کہا جاتا ہے، ایک قدرتی طبعی رجحان ہے جو بارش کے قطرے کے ذریعے سورج کی روشنی کے اضطراب اور پھیلاؤ کے نتیجے میں ہوتا ہے۔
sample audio for each speaker is listed below:
Speaker 0
Vietnamese
This section lists text to speech models for Vietnamese.
vits-piper-vi_VN-25hours_single-low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/25hours_single/low
| Number of speakers | Sample rate |
|---|---|
| 1 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-vi_VN-25hours_single-low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
data_dir="vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
tokens="vits-piper-vi_VN-25hours_single-low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
sid=0,
speed=1.0)
# Save as WAV, like the other examples; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
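Before constructing the config, it can help to confirm that the three paths used above (the ONNX model, tokens.txt, and the espeak-ng-data directory) actually exist in the extracted model directory. The following is a minimal sketch using only the Python standard library; the helper name missing_model_files is an assumption made for illustration and is not part of sherpa-onnx.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str, model_name: str) -> List[str]:
    """Return the expected model paths that do not exist under model_dir.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    root = Path(model_dir)
    expected = [
        root / model_name,        # the VITS ONNX model file
        root / "tokens.txt",      # the token table
        root / "espeak-ng-data",  # the espeak-ng data directory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = missing_model_files(
        "vits-piper-vi_VN-25hours_single-low",
        "vi_VN-25hours_single-low.onnx",
    )
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

If the list is not empty, you probably need to download and extract the model into the current directory first.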
C++ API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx".into()),
tokens: Some("vits-piper-vi_VN-25hours_single-low/tokens.txt".into()),
data_dir: Some("vits-piper-vi_VN-25hours_single-low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx',
tokens: 'vits-piper-vi_VN-25hours_single-low/tokens.txt',
dataDir: 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
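The Node.js example above reports a real-time factor (RTF), which is the elapsed generation time divided by the duration of the generated audio. The same arithmetic can be sketched in Python as follows. The function name real_time_factor and the numbers in the usage lines are assumptions made for illustration; they are not measurements of this model.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int,
                     sample_rate: int) -> float:
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio duration in seconds
    return elapsed_seconds / duration


# Hypothetical numbers: 32000 samples at 16000 Hz (2 seconds of audio)
# generated in 0.5 seconds.
rtf = real_time_factor(0.5, 32000, 16000)
print(f"RTF = {rtf:.3f}")  # prints "RTF = 0.250"
```

An RTF below 1 means the audio is generated faster than it takes to play it back.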
Dart API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx',
tokens: 'vits-piper-vi_VN-25hours_single-low/tokens.txt',
dataDir: 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
lexicon: "",
tokens: "vits-piper-vi_VN-25hours_single-low/tokens.txt",
dataDir: "vits-piper-vi_VN-25hours_single-low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
tokens = "vits-piper-vi_VN-25hours_single-low/tokens.txt",
dataDir = "vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
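The callback in the example above returns 1 to continue generation and 0 to stop it. The following Python sketch illustrates that convention in isolation. It does not call sherpa-onnx: fake_generate is a hypothetical stand-in for the engine, and StopAfter is a hypothetical callback that asks to stop once it has received enough samples.

```python
from typing import Callable, List


class StopAfter:
    """Collect audio chunks and ask the engine to stop after max_samples."""

    def __init__(self, max_samples: int) -> None:
        self.max_samples = max_samples
        self.chunks: List[List[float]] = []
        self.total = 0

    def __call__(self, samples: List[float]) -> int:
        self.chunks.append(list(samples))
        self.total += len(samples)
        # 1 means to continue, 0 means to stop
        return 1 if self.total < self.max_samples else 0


def fake_generate(chunks: List[List[float]],
                  callback: Callable[[List[float]], int]) -> int:
    """Hypothetical stand-in for a TTS engine.

    Passes chunks to the callback one by one until it returns 0.
    Returns the number of chunks delivered.
    """
    delivered = 0
    for chunk in chunks:
        delivered += 1
        if callback(chunk) == 0:
            break
    return delivered


cb = StopAfter(max_samples=5)
delivered = fake_generate([[0.0] * 3, [0.0] * 3, [0.0] * 3], cb)
print(delivered, cb.total)  # prints "2 6"
```

Here the third chunk is never delivered because the callback returned 0 after the second one.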
Java API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx");
vits.setTokens("vits-piper-vi_VN-25hours_single-low/tokens.txt");
vits.setDataDir("vits-piper-vi_VN-25hours_single-low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-vi_VN-25hours_single-low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-vi_VN-25hours_single-low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-vi_VN-25hours_single-low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-vi_VN-25hours_single-low/vi_VN-25hours_single-low.onnx",
Tokens: "vits-piper-vi_VN-25hours_single-low/tokens.txt",
DataDir: "vits-piper-vi_VN-25hours_single-low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nước cũ đào gỗ mới, sông cũ chảy nước mới
sample audio for each speaker is listed below:
Speaker 0
vits-piper-vi_VN-vais1000-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/vais1000/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-vi_VN-vais1000-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
data_dir="vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
tokens="vits-piper-vi_VN-vais1000-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
sid=0,
speed=1.0)
# Save as WAV, like the other examples; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
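The example above saves the audio with the soundfile package. If you prefer to avoid that dependency, the following is a minimal sketch that writes mono float samples as 16-bit PCM using only the Python standard library. The function name write_wave_int16 is an assumption made for illustration, and the sketch assumes the samples are floats in the range [-1, 1].

```python
import struct
import wave
from typing import Sequence


def write_wave_int16(filename: str, samples: Sequence[float],
                     sample_rate: int) -> None:
    """Write mono float samples in [-1, 1] as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; it is not part of sherpa-onnx.
    """
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to avoid integer overflow
        pcm += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


# Usage with the audio object from the example above (assumed to expose
# samples and sample_rate as shown there):
# write_wave_int16("test.wav", audio.samples, audio.sample_rate)
```

For this model the sample rate passed in would be 22050, as listed in the table above.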
C++ API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx".into()),
tokens: Some("vits-piper-vi_VN-vais1000-medium/tokens.txt".into()),
data_dir: Some("vits-piper-vi_VN-vais1000-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx',
tokens: 'vits-piper-vi_VN-vais1000-medium/tokens.txt',
dataDir: 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx',
tokens: 'vits-piper-vi_VN-vais1000-medium/tokens.txt',
dataDir: 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
lexicon: "",
tokens: "vits-piper-vi_VN-vais1000-medium/tokens.txt",
dataDir: "vits-piper-vi_VN-vais1000-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
tokens = "vits-piper-vi_VN-vais1000-medium/tokens.txt",
dataDir = "vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx");
vits.setTokens("vits-piper-vi_VN-vais1000-medium/tokens.txt");
vits.setDataDir("vits-piper-vi_VN-vais1000-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-vi_VN-vais1000-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-vi_VN-vais1000-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-vi_VN-vais1000-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-vi_VN-vais1000-medium/vi_VN-vais1000-medium.onnx",
Tokens: "vits-piper-vi_VN-vais1000-medium/tokens.txt",
DataDir: "vits-piper-vi_VN-vais1000-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nước cũ đào gỗ mới, sông cũ chảy nước mới
sample audios for different speakers are listed below:
Speaker 0
vits-piper-vi_VN-vivos-x_low
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/vi/vi_VN/vivos/x_low
| Number of speakers | Sample rate |
|---|---|
| 65 | 16000 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-vi_VN-vivos-x_low
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
data_dir="vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
tokens="vits-piper-vi_VN-vivos-x_low/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Nước cũ đào gỗ mới, sông cũ chảy nước mới",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
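The Node.js example for this model reports a real-time factor (RTF): the time spent generating divided by the duration of the generated audio. A minimal sketch of the same measurement in Python follows. The helper names `real_time_factor` and `timed` are hypothetical and not part of sherpa-onnx, and the commented usage assumes the `tts` object created in the Python example above.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by audio duration (lower means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate
    return elapsed_seconds / duration


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage with the `tts` object from the Python example:
#   audio, elapsed = timed(tts.generate, text="...", sid=0, speed=1.0)
#   rtf = real_time_factor(elapsed, len(audio.samples), audio.sample_rate)

# 32000 samples at 16000 Hz is 2 s of audio; generating it in 0.5 s gives 0.25.
print(real_time_factor(0.5, 32000, 16000))
```

An RTF below 1 means the audio is generated faster than it plays back.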
C++ API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
config.model.vits.tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
config.model.vits.data_dir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx".into()),
tokens: Some("vits-piper-vi_VN-vivos-x_low/tokens.txt".into()),
data_dir: Some("vits-piper-vi_VN-vivos-x_low/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx',
tokens: 'vits-piper-vi_VN-vivos-x_low/tokens.txt',
dataDir: 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Nước cũ đào gỗ mới, sông cũ chảy nước mới';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx',
tokens: 'vits-piper-vi_VN-vivos-x_low/tokens.txt',
dataDir: 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Nước cũ đào gỗ mới, sông cũ chảy nước mới', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
lexicon: "",
tokens: "vits-piper-vi_VN-vivos-x_low/tokens.txt",
dataDir: "vits-piper-vi_VN-vivos-x_low/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx";
config.Model.Vits.Tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
tokens = "vits-piper-vi_VN-vivos-x_low/tokens.txt",
dataDir = "vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx");
vits.setTokens("vits-piper-vi_VN-vivos-x_low/tokens.txt");
vits.setDataDir("vits-piper-vi_VN-vivos-x_low/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Nước cũ đào gỗ mới, sông cũ chảy nước mới";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx';
Config.Model.Vits.Tokens := 'vits-piper-vi_VN-vivos-x_low/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-vi_VN-vivos-x_low/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Nước cũ đào gỗ mới, sông cũ chảy nước mới', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-vi_VN-vivos-x_low with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-vi_VN-vivos-x_low/vi_VN-vivos-x_low.onnx",
Tokens: "vits-piper-vi_VN-vivos-x_low/tokens.txt",
DataDir: "vits-piper-vi_VN-vivos-x_low/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Nước cũ đào gỗ mới, sông cũ chảy nước mới"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Nước cũ đào gỗ mới, sông cũ chảy nước mới
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Speaker 11
Speaker 12
Speaker 13
Speaker 14
Speaker 15
Speaker 16
Speaker 17
Speaker 18
Speaker 19
Speaker 20
Speaker 21
Speaker 22
Speaker 23
Speaker 24
Speaker 25
Speaker 26
Speaker 27
Speaker 28
Speaker 29
Speaker 30
Speaker 31
Speaker 32
Speaker 33
Speaker 34
Speaker 35
Speaker 36
Speaker 37
Speaker 38
Speaker 39
Speaker 40
Speaker 41
Speaker 42
Speaker 43
Speaker 44
Speaker 45
Speaker 46
Speaker 47
Speaker 48
Speaker 49
Speaker 50
Speaker 51
Speaker 52
Speaker 53
Speaker 54
Speaker 55
Speaker 56
Speaker 57
Speaker 58
Speaker 59
Speaker 60
Speaker 61
Speaker 62
Speaker 63
Speaker 64
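Since this model has 65 speakers (sid 0 to 64), one file per speaker can be produced with a simple loop. The following is a minimal sketch, not code from the sherpa-onnx repository: `speaker_jobs` and `generate_all` are hypothetical helpers, and `tts` is assumed to behave like the `sherpa_onnx.OfflineTts` object in the Python example for this model, that is, `generate(text=..., sid=..., speed=...)` returns an object with `samples` and `sample_rate`.

```python
def speaker_jobs(num_speakers, prefix="speaker"):
    """Return (sid, filename) pairs for speaker IDs 0 .. num_speakers - 1."""
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return [(sid, f"{prefix}-{sid}.wav") for sid in range(num_speakers)]


def generate_all(tts, text, num_speakers, write_wave, speed=1.0):
    """Generate `text` once per speaker and pass each result to write_wave.

    write_wave is any callable taking (filename, samples, sample_rate).
    Returns the list of filenames that were written.
    """
    written = []
    for sid, filename in speaker_jobs(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=speed)
        write_wave(filename, audio.samples, audio.sample_rate)
        written.append(filename)
    return written


# Hypothetical usage with soundfile and the `tts` object from the Python example:
#   import soundfile as sf
#   generate_all(
#       tts,
#       "Nước cũ đào gỗ mới, sông cũ chảy nước mới",
#       num_speakers=65,
#       write_wave=lambda f, s, sr: sf.write(f, s, samplerate=sr),
#   )
```

Passing the writer as a callable keeps the loop independent of the audio library used to save the files.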
supertonic-3-vi
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is supertonic 3 from https://huggingface.co/Supertone/supertonic-3
It supports 31 languages: en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi.
This page shows samples for Vietnamese (vi).
| Number of speakers | Sample rate |
|---|---|
| 10 | 24000 |
Speaker IDs
| sid | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with supertonic-3
import sherpa_onnx
import soundfile as sf
tts_config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(
duration_predictor="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
text_encoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vector_estimator="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
tts_json="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicode_indexer="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voice_style="sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
debug=False,
num_threads=2,
provider="cpu",
),
)
tts = sherpa_onnx.OfflineTts(tts_config)
gen_config = sherpa_onnx.GenerationConfig()
gen_config.sid = 0
gen_config.num_steps = 8
gen_config.speed = 1.0
gen_config.extra["lang"] = "vi"
audio = tts.generate("Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo", gen_config)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
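Because the model accepts 31 language codes and 10 speaker IDs, it can help to check both values before they are placed into the generation config. The following is a minimal sketch under stated assumptions: `check_request` is a hypothetical helper, not part of sherpa-onnx, the language list is copied from the model description above, and whether the library performs its own validation is not covered here.

```python
# Language codes listed in the model description above (31 entries).
SUPPORTED_LANGS = (
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es",
    "et", "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv",
    "nl", "pl", "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk",
    "vi",
)

# Number of speakers from the table above (valid sid values are 0..9).
NUM_SPEAKERS = 10


def check_request(lang, sid):
    """Validate lang and sid; return (sid, extra) for the generation config."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if not 0 <= sid < NUM_SPEAKERS:
        raise ValueError(f"sid must be in [0, {NUM_SPEAKERS - 1}], got {sid}")
    return sid, {"lang": lang}


# Hypothetical usage with the gen_config object from the example above:
#   sid, extra = check_request("vi", 0)
#   gen_config.sid = sid
#   gen_config.extra["lang"] = extra["lang"]
```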
C API
You can use the following code to play with supertonic-3 with C API.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0;
gen_cfg.extra = "{\"lang\": \"vi\"}";
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg,
ProgressCallback, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.c.
Then you can compile it with the following command:
gcc \
/tmp/test-supertonic.c \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with supertonic-3 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.supertonic.duration_predictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.model.supertonic.text_encoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.model.supertonic.vector_estimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.model.supertonic.vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.model.supertonic.tts_json = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.model.supertonic.unicode_indexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.model.supertonic.voice_style = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./test.wav";
std::string text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.num_steps = 8;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
gen_cfg.extra = R"({"lang": "vi"})";
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-supertonic.cc.
Then you can compile it with the following command:
g++ \
-std=c++17 \
/tmp/test-supertonic.cc \
-I /tmp/sherpa-onnx/shared/include \
-L /tmp/sherpa-onnx/shared/lib \
-lsherpa-onnx-cxx-api \
-lsherpa-onnx-c-api \
-lonnxruntime \
-o /tmp/test-supertonic
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-supertonic
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-supertonic.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with supertonic-3 with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
supertonic: OfflineTtsSupertonicModelConfig {
duration_predictor: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx".into()),
text_encoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx".into()),
vector_estimator: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx".into()),
vocoder: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx".into()),
tts_json: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json".into()),
unicode_indexer: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin".into()),
voice_style: Some("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";
let gen_config = GenerationConfig {
sid: 0,
num_steps: 8,
speed: 1.0,
extra: Some(r#"{"lang": "vi"}"#.into()),
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
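Every example in this section points at the same seven files under sherpa-onnx-supertonic-3-tts-int8-2026-05-11/. A minimal sketch, in Python, for checking that they are all present before you create the TTS object. The file names are copied from the examples above; the helper name `missing_model_files` is hypothetical and not part of sherpa-onnx.

```python
from pathlib import Path

# Directory and file names as used in the examples above
MODEL_DIR = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11"
REQUIRED_FILES = [
    "duration_predictor.int8.onnx",
    "text_encoder.int8.onnx",
    "vector_estimator.int8.onnx",
    "vocoder.int8.onnx",
    "tts.json",
    "unicode_indexer.bin",
    "voice.bin",
]


def missing_model_files(model_dir: str = MODEL_DIR) -> list[str]:
    """Return the required file names that are not found in model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found")
```

Running the check from the directory where you extracted the model should report nothing missing; a non-empty list usually means the working directory is wrong.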
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with supertonic-3 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
supertonic: {
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
},
debug: true,
numThreads: 2,
provider: 'cpu',
},
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {lang: 'vi'},
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
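The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, so a value below 1 means synthesis ran faster than playback. A minimal sketch of the same arithmetic in Python; the function name `real_time_factor` is hypothetical and not part of sherpa-onnx.

```python
def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (lower is faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


if __name__ == "__main__":
    # Illustrative numbers only: 2 seconds of 22050 Hz audio generated in 0.5 s
    print(f"RTF = {real_time_factor(0.5, 44100, 22050):.3f}")
```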
Dart API
You can use the following code to play with supertonic-3 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(
durationPredictor: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx',
textEncoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx',
vectorEstimator: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx',
vocoder: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx',
ttsJson: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json',
unicodeIndexer: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin',
voiceStyle: 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
supertonic: supertonic,
numThreads: 2,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
numSteps: 8,
speed: 1.0,
extra: {'lang': 'vi'},
);
final audio = tts.generateWithConfig(text: 'Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with supertonic-3 with Swift API.
func run() {
let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(
durationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(supertonic: supertonic)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.numSteps = 8
genConfig.speed = 1.0
genConfig.extra = ["lang": "vi"]
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with supertonic-3 with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Supertonic.DurationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx";
config.Model.Supertonic.TextEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx";
config.Model.Supertonic.VectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx";
config.Model.Supertonic.Vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx";
config.Model.Supertonic.TtsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json";
config.Model.Supertonic.UnicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin";
config.Model.Supertonic.VoiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin";
config.Model.NumThreads = 2;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
var tts = new OfflineTts(config);
var text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.NumSteps = 8;
genConfig.Speed = 1.0f;
genConfig.Extra = "{\"lang\": \"vi\"}";
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with supertonic-3 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
supertonic = OfflineTtsSupertonicModelConfig(
durationPredictor = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
textEncoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
vectorEstimator = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
vocoder = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
ttsJson = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
unicodeIndexer = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
voiceStyle = "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
),
numThreads = 2,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
numSteps = 8,
speed = 1.0f,
extra = mapOf("lang" to "vi"),
)
val audio = tts.generateWithConfigAndCallback(
text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
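The callbacks in the examples above return 1 to continue generating and 0 to stop. A minimal sketch, in Python, of a callback object that asks for a stop once a sample budget has been reached. It does not call sherpa-onnx; the class name `StopAfter` is hypothetical, and the `(samples, progress)` argument order is an assumption modelled on the C callback shown earlier.

```python
class StopAfter:
    """Callable that asks the generator to stop after max_samples samples."""

    def __init__(self, max_samples: int):
        self.max_samples = max_samples
        self.received = 0  # total number of samples seen so far

    def __call__(self, samples, progress: float = 0.0) -> int:
        self.received += len(samples)
        # 1 means continue generating, 0 means stop
        return 1 if self.received < self.max_samples else 0


if __name__ == "__main__":
    callback = StopAfter(max_samples=5)
    # Simulate three chunks of audio arriving from a generator
    for chunk in ([0.0] * 2, [0.0] * 2, [0.0] * 2):
        keep_going = callback(chunk)
        print(callback.received, keep_going)
        if not keep_going:
            break
```

Keeping the counter on the object rather than in a global makes it easy to reuse one callback type for several requests.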
Java API
You can use the following code to play with supertonic-3 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var supertonic = new OfflineTtsSupertonicModelConfig();
supertonic.setDurationPredictor("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx");
supertonic.setTextEncoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx");
supertonic.setVectorEstimator("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx");
supertonic.setVocoder("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx");
supertonic.setTtsJson("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json");
supertonic.setUnicodeIndexer("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin");
supertonic.setVoiceStyle("sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setSupertonic(supertonic);
modelConfig.setNumThreads(2);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
var tts = new OfflineTts(config);
var text = "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setNumSteps(8);
genConfig.setSpeed(1.0f);
genConfig.setExtra("{\"lang\": \"vi\"}");
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with supertonic-3 with Pascal API.
program test_supertonic;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Supertonic.DurationPredictor := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx';
Config.Model.Supertonic.TextEncoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx';
Config.Model.Supertonic.VectorEstimator := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx';
Config.Model.Supertonic.Vocoder := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx';
Config.Model.Supertonic.TtsJson := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json';
Config.Model.Supertonic.UnicodeIndexer := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin';
Config.Model.Supertonic.VoiceStyle := 'sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin';
Config.Model.NumThreads := 2;
Config.Model.Debug := True;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.NumSteps := 8;
GenConfig.Speed := 1.0;
GenConfig.Extra := '{"lang": "vi"}';
Audio := Tts.GenerateWithConfig('Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with supertonic-3 with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Supertonic: sherpa.OfflineTtsSupertonicModelConfig{
DurationPredictor: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/duration_predictor.int8.onnx",
TextEncoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/text_encoder.int8.onnx",
VectorEstimator: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vector_estimator.int8.onnx",
Vocoder: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/vocoder.int8.onnx",
TtsJson: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/tts.json",
UnicodeIndexer: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/unicode_indexer.bin",
VoiceStyle: "sherpa-onnx-supertonic-3-tts-int8-2026-05-11/voice.bin",
},
NumThreads: 2,
Debug: true,
},
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo"
genConfig := sherpa.GenerationConfig{
Sid: 0,
NumSteps: 8,
Speed: 1.0,
Extra: `{"lang": "vi"}`,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
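Several of the examples above pass the language through `extra` as a hand-escaped JSON string such as {"lang": "vi"}. A minimal sketch, in Python, of producing that string with a JSON serializer instead of escaping quotes by hand; the helper name `make_extra` is hypothetical and not part of sherpa-onnx.

```python
import json


def make_extra(lang: str) -> str:
    """Serialize the per-request options to a JSON string."""
    return json.dumps({"lang": lang})


if __name__ == "__main__":
    extra = make_extra("vi")
    print(extra)
    # Round-trip to confirm the string is valid JSON
    assert json.loads(extra) == {"lang": "vi"}
```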
Samples
Sample audio for each speaker is listed below:
Speaker 0
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 1
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 2
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 3
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 4
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 5
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 6
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 7
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 8
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Speaker 9
| No. | Text |
|---|---|
| 0 | Xin chào thế giới. |
| 1 | Hôm nay bạn thế nào? |
| 2 | Bầu trời xanh và gió rất nhẹ. |
| 3 | Học máy giúp máy tính học từ dữ liệu. |
| 4 | Tổng hợp giọng nói chuyển văn bản thành âm thanh rõ ràng. |
| 5 | Học sinh đọc một câu chuyện ngắn trong thư viện. |
| 6 | Tàu bị trễ vì công việc bảo trì đường ray. |
| 7 | Các mô hình nhỏ chạy nhanh trên thiết bị cục bộ. |
| 8 | Trợ lý giọng nói hỗ trợ các công việc hằng ngày. |
| 9 | Việc đọc ổn định rất quan trọng cho câu ngắn và câu dài. |
Welsh
This section lists text to speech models for Welsh.
vits-piper-cy_GB-bu_tts-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cy/cy_GB/bu_tts/medium
| Number of speakers | Sample rate |
|---|---|
| 7 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what an ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc /tmp/test-piper.c -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        vits=sherpa_onnx.OfflineTtsVitsModelConfig(
            model="vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
            data_dir="vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
            tokens="vits-piper-cy_GB-bu_tts-medium/tokens.txt",
        ),
        num_threads=1,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(
    text="Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
    sid=0,
    speed=1.0,
)
sf.write("test.mp3", audio.samples, samplerate=audio.sample_rate)
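If soundfile is not available, the samples can also be written with Python's standard wave module. A minimal sketch, assuming the samples are floats in the range [-1, 1] (values outside it are clipped) and that 16-bit mono PCM is acceptable; the helper name `write_wav_16bit` is hypothetical and not part of sherpa-onnx.

```python
import struct
import wave


def write_wav_16bit(filename: str, samples, sample_rate: int) -> None:
    """Write mono float samples as a 16-bit PCM WAV file."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        pcm += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)  # mono
        f.setsampwidth(2)  # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(bytes(pcm))


if __name__ == "__main__":
    # Half a second of silence at 22050 Hz, the sample rate listed for this model
    write_wav_16bit("silence.wav", [0.0] * 11025, 22050)
```

With the example above you would call it as write_wav_16bit("test.wav", audio.samples, audio.sample_rate).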
C++ API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
config.model.vits.tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 /tmp/test-piper.cc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -o /tmp/test-piper
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx".into()),
tokens: Some("vits-piper-cy_GB-bu_tts-medium/tokens.txt".into()),
data_dir: Some("vits-piper-cy_GB-bu_tts-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx',
tokens: 'vits-piper-cy_GB-bu_tts-medium/tokens.txt',
dataDir: 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx',
tokens: 'vits-piper-cy_GB-bu_tts-medium/tokens.txt',
dataDir: 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
lexicon: "",
tokens: "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
dataDir: "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
tokens = "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
dataDir = "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx");
vits.setTokens("vits-piper-cy_GB-bu_tts-medium/tokens.txt");
vits.setDataDir("vits-piper-cy_GB-bu_tts-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-cy_GB-bu_tts-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-cy_GB-bu_tts-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-cy_GB-bu_tts-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-cy_GB-bu_tts-medium/cy_GB-bu_tts-medium.onnx",
Tokens: "vits-piper-cy_GB-bu_tts-medium/tokens.txt",
DataDir: "vits-piper-cy_GB-bu_tts-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd
sample audios for different speakers are listed below:
Speaker 0
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
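The samples above correspond to speaker IDs 0 through 6. As a minimal sketch, assuming the Python API call `tts.generate(text=..., sid=..., speed=...)` used in the Python examples in this document, one audio per speaker could be produced as follows. The helper names `speaker_filename` and `generate_for_all_speakers` are hypothetical and are not part of sherpa-onnx.

```python
def speaker_filename(stem: str, sid: int) -> str:
    """Build an output filename such as 'test-speaker-3.wav' (hypothetical naming)."""
    return f"{stem}-speaker-{sid}.wav"


def generate_for_all_speakers(tts, text: str, num_speakers: int, speed: float = 1.0):
    """Call tts.generate() once per speaker ID and return {sid: audio}.

    `tts` is assumed to behave like sherpa_onnx.OfflineTts, i.e. to expose
    generate(text=..., sid=..., speed=...). `num_speakers` is passed in
    explicitly (7 for the speaker list above).
    """
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    results = {}
    for sid in range(num_speakers):
        results[sid] = tts.generate(text=text, sid=sid, speed=speed)
    return results
```

With a real `sherpa_onnx.OfflineTts` instance, each returned audio could then be saved with `soundfile.write(speaker_filename("test", sid), audio.samples, samplerate=audio.sample_rate)`, mirroring the Python examples in this document.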
vits-piper-cy_GB-gwryw_gogleddol-medium
| Info about this model | Download the model | Android APK | C API | C++ API |
| Python API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is converted from https://huggingface.co/rhasspy/piper-voices/tree/main/cy/cy_GB/gwryw_gogleddol/medium
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
Download the model
Model download address
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2.
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
C API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.vits.model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
config.model.vits.tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
const char *text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
SherpaOnnxGenerationConfig gen_cfg;
memset(&gen_cfg, 0, sizeof(gen_cfg));
gen_cfg.sid = 0;
gen_cfg.speed = 1.0;
const SherpaOnnxGeneratedAudio *audio =
SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
"./test.wav");
// You need to free the pointers to avoid memory leak in your app
SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
SherpaOnnxDestroyOfflineTts(tts);
printf("Saved to ./test.wav\n");
return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
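If you launch the compiled binary from a script rather than from an interactive shell, the same library path can be set for the child process only. The following is a rough Python sketch under that assumption; the helper names `library_path_var` and `env_with_library_dir` are hypothetical and only the Linux and macOS variables mentioned above are handled.

```python
import os
import sys


def library_path_var(platform: str = sys.platform) -> str:
    """Return the dynamic-loader search path variable for the given platform."""
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    if platform.startswith("linux"):
        return "LD_LIBRARY_PATH"
    raise ValueError(f"unsupported platform: {platform}")


def env_with_library_dir(lib_dir: str, env=None, platform: str = sys.platform) -> dict:
    """Return a copy of `env` (default: os.environ) with lib_dir prepended
    to the platform's library search path variable."""
    new_env = dict(os.environ if env is None else env)
    var = library_path_var(platform)
    old = new_env.get(var, "")
    # Prepend so that the freshly built libraries are found first.
    new_env[var] = lib_dir + (os.pathsep + old if old else "")
    return new_env
```

For example, `subprocess.run(["/tmp/test-piper"], cwd="/tmp", env=env_with_library_dir("/tmp/sherpa-onnx/shared/lib"), check=True)` would run the example with the shared libraries from the build above on the search path.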
Use static library (static link)
Please see the documentation at
Python API
Assume you have installed sherpa-onnx via
pip install sherpa-onnx
and you have downloaded the model from
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
model=sherpa_onnx.OfflineTtsModelConfig(
vits=sherpa_onnx.OfflineTtsVitsModelConfig(
model="vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
data_dir="vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
tokens="vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
),
num_threads=1,
),
)
if not config.validate():
raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
sid=0,
speed=1.0)
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
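The Node.js example in this document reports a real-time factor (RTF), that is, the elapsed generation time divided by the duration of the generated audio. The same measurement can be sketched in Python. The helper names `real_time_factor` and `timed_generate` are hypothetical and not part of sherpa-onnx; `tts` is assumed to be the `sherpa_onnx.OfflineTts` object created above.

```python
import time


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return elapsed time divided by audio duration (smaller means faster)."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


def timed_generate(tts, text: str, sid: int = 0, speed: float = 1.0):
    """Call tts.generate() and return (audio, rtf).

    Assumes `audio.samples` has a length and `audio.sample_rate` is an int,
    as in the Python example above.
    """
    start = time.perf_counter()
    audio = tts.generate(text=text, sid=sid, speed=speed)
    elapsed = time.perf_counter() - start
    return audio, real_time_factor(elapsed, len(audio.samples), audio.sample_rate)
```

An RTF below 1 means the audio is generated faster than it would take to play it back.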
C++ API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_onnx::cxx; // NOLINT
OfflineTtsConfig config;
config.model.vits.model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
config.model.vits.tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
config.model.vits.data_dir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
config.model.num_threads = 1;
// If you want to see debug messages, please set it to 1
config.model.debug = 0;
std::string filename = "./test.wav";
std::string text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
auto tts = OfflineTts::Create(config);
GenerationConfig gen_cfg;
gen_cfg.sid = 0;
gen_cfg.speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-piper.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-piper /tmp/test-piper.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-piper
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-piper.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Rust API.
use sherpa_onnx::{
GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,
};
fn main() {
let config = OfflineTtsConfig {
model: sherpa_onnx::OfflineTtsModelConfig {
vits: OfflineTtsVitsModelConfig {
model: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx".into()),
tokens: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt".into()),
data_dir: Some("vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data".into()),
..Default::default()
},
num_threads: 2,
debug: false,
..Default::default()
},
..Default::default()
};
let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
println!("Sample rate: {}", tts.sample_rate());
println!("Num speakers: {}", tts.num_speakers());
let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
let gen_config = GenerationConfig {
sid: 0,
speed: 1.0,
..Default::default()
};
let audio = tts
.generate_with_config(
text,
&gen_config,
Some(|_samples: &[f32], progress: f32| -> bool {
println!("Progress: {:.1}%", progress * 100.0);
true
}),
)
.expect("Generation failed");
let filename = "./test.wav";
if audio.save(filename) {
println!("Saved to: {}", filename);
} else {
eprintln!("Failed to save {}", filename);
}
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
const config = {
model: {
vits: {
model: 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx',
tokens: 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt',
dataDir: 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data',
},
debug: true,
numThreads: 1,
provider: 'cpu',
},
maxNumSentences: 1,
};
return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd';
const generationConfig = new sherpa_onnx.GenerationConfig({
sid: 0,
speed: 1.0,
silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
`RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
Dart API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
final vits = sherpa_onnx.OfflineTtsVitsModelConfig(
model: 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx',
tokens: 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt',
dataDir: 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data',
);
final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
vits: vits,
numThreads: 1,
debug: true,
);
final config = sherpa_onnx.OfflineTtsConfig(
model: modelConfig,
maxNumSenetences: 1,
);
final tts = sherpa_onnx.OfflineTts(config);
final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
sid: 0,
speed: 1.0,
silenceScale: 0.2,
);
final audio = tts.generateWithConfig(text: 'Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', config: genConfig);
tts.free();
sherpa_onnx.writeWave(
filename: 'test.wav',
samples: audio.samples,
sampleRate: audio.sampleRate,
);
print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Swift API.
func run() {
let vits = sherpaOnnxOfflineTtsVitsModelConfig(
model: "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
lexicon: "",
tokens: "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
dataDir: "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data"
)
let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)
var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
let text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
var genConfig = SherpaOnnxGenerationConfigSwift()
genConfig.sid = 0
genConfig.speed = 1.0
genConfig.silenceScale = 0.2
let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
let filename = "test.wav"
let ok = audio.save(filename: filename)
if ok == 1 {
print("Saved to \(filename)")
} else {
print("Failed to save \(filename)")
}
}
@main
struct App {
static func main() {
run()
}
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Vits.Model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx";
config.Model.Vits.Tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt";
config.Model.Vits.DataDir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
Console.WriteLine("Saved to ./test.wav");
}
else
{
Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
var config = OfflineTtsConfig(
model = OfflineTtsModelConfig(
vits = OfflineTtsVitsModelConfig(
model = "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
tokens = "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
dataDir = "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
),
numThreads = 1,
debug = true,
),
)
val tts = OfflineTts(config = config)
val genConfig = GenerationConfig(
sid = 0,
speed = 1.0f,
silenceScale = 0.2f,
)
val audio = tts.generateWithConfigAndCallback(
text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd",
config = genConfig,
callback = ::callback,
)
audio.save(filename = "test.wav")
tts.release()
println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
// 1 means to continue
// 0 means to stop
return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
public static void main(String[] args) {
var vits = new OfflineTtsVitsModelConfig();
vits.setModel("vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx");
vits.setTokens("vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt");
vits.setDataDir("vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data");
var modelConfig = new OfflineTtsModelConfig();
modelConfig.setVits(vits);
modelConfig.setNumThreads(1);
modelConfig.setDebug(true);
var config = new OfflineTtsConfig();
config.setModel(modelConfig);
config.setMaxNumSentences(1);
var tts = new OfflineTts(config);
var text = "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd";
var genConfig = new GenerationConfig();
genConfig.setSid(0);
genConfig.setSpeed(1.0f);
genConfig.setSilenceScale(0.2f);
var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
// 1 means to continue, 0 means to stop
return 1;
});
audio.save("test.wav");
tts.release();
System.out.println("Saved to test.wav");
}
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Pascal API.
program test_piper;
{$mode objfpc}
uses
SysUtils,
sherpa_onnx;
var
Config: TSherpaOnnxOfflineTtsConfig;
Tts: TSherpaOnnxOfflineTts;
Audio: TSherpaOnnxGeneratedAudio;
GenConfig: TSherpaOnnxGenerationConfig;
begin
FillChar(Config, SizeOf(Config), 0);
Config.Model.Vits.Model := 'vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx';
Config.Model.Vits.Tokens := 'vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt';
Config.Model.Vits.DataDir := 'vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data';
Config.Model.NumThreads := 1;
Config.Model.Debug := True;
Config.MaxNumSentences := 1;
Tts := TSherpaOnnxOfflineTts.Create(@Config);
GenConfig.Sid := 0;
GenConfig.Speed := 1.0;
GenConfig.SilenceScale := 0.2;
Audio := Tts.GenerateWithConfig('Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd', @GenConfig, nil);
WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
WriteLn('Saved to ./test.wav');
Audio.Free;
Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with vits-piper-cy_GB-gwryw_gogleddol-medium with Go API.
package main
import (
"fmt"
sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
config := sherpa.OfflineTtsConfig{
Model: sherpa.OfflineTtsModelConfig{
Vits: sherpa.OfflineTtsVitsModelConfig{
Model: "vits-piper-cy_GB-gwryw_gogleddol-medium/cy_GB-gwryw_gogleddol-medium.onnx",
Tokens: "vits-piper-cy_GB-gwryw_gogleddol-medium/tokens.txt",
DataDir: "vits-piper-cy_GB-gwryw_gogleddol-medium/espeak-ng-data",
},
NumThreads: 1,
Debug: true,
},
MaxNumSentences: 1,
}
tts := sherpa.NewOfflineTts(&config)
defer tts.Delete()
text := "Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd"
genConfig := sherpa.GenerationConfig{
Sid: 0,
Speed: 1.0,
SilenceScale: 0.2,
}
audio := tts.GenerateWithConfig(text, &genConfig, nil)
filename := "./test.wav"
sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd
sample audios for different speakers are listed below: