kitten-mini-en-v0_1-fp16
| Info about this model | Download the model | Android APK | Python API | C API |
| C++ API | Rust API | Node.js API | Dart API | Swift API |
| C# API | Kotlin API | Java API | Pascal API | Go API |
| Samples |
Info about this model
This model is kitten-tts-mini-0.1 and it is from https://huggingface.co/KittenML/kitten-tts-mini-0.1
It supports only English.
| Number of speakers | Sample rate |
|---|---|
| 8 | 24000 |
Download the model
Model download address
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_1-fp16.tar.bz2
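The download step can also be scripted. Below is a minimal sketch using only the Python standard library. The helper names (`archive_name`, `model_dir_name`, `download_and_extract`) are hypothetical, and the sketch assumes the archive extracts to a directory named after the archive, which matches the `./kitten-mini-en-v0_1-fp16` paths used in the API examples on this page.

```python
import tarfile
import urllib.request
from pathlib import Path

# Download address copied from this page.
MODEL_URL = (
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/"
    "tts-models/kitten-mini-en-v0_1-fp16.tar.bz2"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of a download URL."""
    return url.rsplit("/", 1)[-1]


def model_dir_name(url: str) -> str:
    """Return the expected model directory name.

    Assumption: the archive extracts to a directory named like the
    archive without its .tar.bz2 suffix.
    """
    return archive_name(url).removesuffix(".tar.bz2")


def download_and_extract(url: str = MODEL_URL, dest: str = ".") -> Path:
    """Download the archive into dest (if missing) and extract it there."""
    archive = Path(dest) / archive_name(url)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest)
    return Path(dest) / model_dir_name(url)
```

Calling `download_and_extract()` from the directory where you plan to run the examples should leave the model files under `./kitten-mini-en-v0_1-fp16`.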
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v1.13.2
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Meaning of speaker suffix
| Suffix | Meaning |
|---|---|
| f | Female |
| m | Male |
speaker ID to speaker name (sid -> name)
The mapping from speaker ID (sid) to speaker name is given below:
| 0 - 1 | 0 -> expr-voice-2-m | 1 -> expr-voice-2-f |
| 2 - 3 | 2 -> expr-voice-3-m | 3 -> expr-voice-3-f |
| 4 - 5 | 4 -> expr-voice-4-m | 5 -> expr-voice-4-f |
| 6 - 7 | 6 -> expr-voice-5-m | 7 -> expr-voice-5-f |
speaker name to speaker ID (name -> sid)
The mapping from speaker name to speaker ID (sid) is given below:
| 0 - 1 | expr-voice-2-m -> 0 | expr-voice-2-f -> 1 |
| 2 - 3 | expr-voice-3-m -> 2 | expr-voice-3-f -> 3 |
| 4 - 5 | expr-voice-4-m -> 4 | expr-voice-4-f -> 5 |
| 6 - 7 | expr-voice-5-m -> 6 | expr-voice-5-f -> 7 |
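The two tables above can be expressed as a small lookup in code. The following Python sketch copies the names and IDs from the tables; the identifiers `SPEAKER_NAMES`, `NAME_TO_SID` and `speaker_gender` are illustrative only and are not part of sherpa-onnx.

```python
# Speaker names indexed by speaker ID (sid), copied from the tables above.
SPEAKER_NAMES = [
    "expr-voice-2-m",  # sid 0
    "expr-voice-2-f",  # sid 1
    "expr-voice-3-m",  # sid 2
    "expr-voice-3-f",  # sid 3
    "expr-voice-4-m",  # sid 4
    "expr-voice-4-f",  # sid 5
    "expr-voice-5-m",  # sid 6
    "expr-voice-5-f",  # sid 7
]

# Reverse mapping: speaker name -> sid.
NAME_TO_SID = {name: sid for sid, name in enumerate(SPEAKER_NAMES)}


def speaker_gender(name: str) -> str:
    """Decode the trailing suffix of a speaker name (f = Female, m = Male)."""
    suffix = name.rsplit("-", 1)[-1]
    return {"f": "Female", "m": "Male"}[suffix]
```

For example, `NAME_TO_SID["expr-voice-4-f"]` gives the `sid` value to pass to the generation call for that voice.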
Python API
Assume you have installed sherpa-onnx and soundfile (used below to save the audio) via
pip install sherpa-onnx soundfile
and you have downloaded the model from
You can use the following code to play with ./kitten-mini-en-v0_1-fp16
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(
            model="./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
            voices="./kitten-mini-en-v0_1-fp16/voices.bin",
            tokens="./kitten-mini-en-v0_1-fp16/tokens.txt",
            data_dir="./kitten-mini-en-v0_1-fp16/espeak-ng-data",
        ),
        num_threads=2,
    ),
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
audio = tts.generate(text="Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.", sid=0, speed=1.0)
# Write a WAV file; MP3 output depends on the installed libsndfile version
sf.write("test.wav", audio.samples, samplerate=audio.sample_rate)
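The example above synthesizes speaker 0 only. Since the model has 8 speakers, a loop over all speaker IDs may be handy for comparing voices. The sketch below is an illustration, not part of sherpa-onnx: the helper name `synthesize_all_speakers`, the `write` callback and the `speaker-<sid>.wav` file names are assumptions, and only the `tts.generate(text=..., sid=..., speed=...)` call is taken from the example above.

```python
def synthesize_all_speakers(tts, text, write, num_speakers=8, speed=1.0):
    """Generate one audio file per speaker ID and return the file names.

    tts:   an object with generate(text=..., sid=..., speed=...), such as
           the OfflineTts instance created in the example above.
    write: a callable (path, samples, sample_rate) that saves the audio.
    """
    paths = []
    for sid in range(num_speakers):
        audio = tts.generate(text=text, sid=sid, speed=speed)
        path = f"speaker-{sid}.wav"  # hypothetical naming scheme
        write(path, audio.samples, audio.sample_rate)
        paths.append(path)
    return paths
```

With the objects from the example above, a possible call is `synthesize_all_speakers(tts, "Hello", lambda p, s, r: sf.write(p, s, samplerate=r))`.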
C API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
  config.model.num_threads = 1;
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  const char *text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");
  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.c.
Then you can compile it with the following command:
gcc -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-kitten /tmp/test-kitten.c -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}
int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;
  config.model.kitten.model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
  config.model.kitten.voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
  config.model.kitten.tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
  config.model.kitten.data_dir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  std::string filename = "./test.wav";
  std::string text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
  auto tts = OfflineTts::Create(config);
  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed
#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
  WriteWave(filename, {audio.samples, audio.sample_rate});
  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());
  return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-kitten.cc.
Then you can compile it with the following command:
g++ -std=c++17 -I /tmp/sherpa-onnx/shared/include -L /tmp/sherpa-onnx/shared/lib -o /tmp/test-kitten /tmp/test-kitten.cc -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the model and extracted it to /tmp
./test-kitten
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-kitten.
Use static library (static link)
Please see the documentation at
Rust API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Rust API.
use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,
};
fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            kitten: OfflineTtsKittenModelConfig {
                model: Some("./kitten-mini-en-v0_1-fp16/model.fp16.onnx".into()),
                voices: Some("./kitten-mini-en-v0_1-fp16/voices.bin".into()),
                tokens: Some("./kitten-mini-en-v0_1-fp16/tokens.txt".into()),
                data_dir: Some("./kitten-mini-en-v0_1-fp16/espeak-ng-data".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        ..Default::default()
    };
    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());
    let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };
    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");
    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with kitten-mini-en-v0_1-fp16 with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
async function createOfflineTtsAsync() {
  const config = {
    model: {
      kitten: {
        model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
        voices: './kitten-mini-en-v0_1-fp16/voices.bin',
        tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
        dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
  };
  return await sherpa_onnx.OfflineTts.createAsync(config);
}
async function main() {
  const tts = await createOfflineTtsAsync();
  const text = 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';
  console.log('Number of speakers:', tts.numSpeakers);
  console.log('Sample rate:', tts.sampleRate);
  const start = Date.now();
  const generationConfig = new sherpa_onnx.GenerationConfig({
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  });
  const audio = await tts.generateAsync({
    text,
    generationConfig,
    onProgress({samples, progress}) {
      process.stdout.write(
          `\rGenerating... ${(progress * 100).toFixed(1)}%`);
      return true;
    },
  });
  console.log('\nGeneration finished.');
  const stop = Date.now();
  const elapsedSeconds = (stop - start) / 1000;
  const durationSeconds = audio.samples.length / audio.sampleRate;
  const realTimeFactor = elapsedSeconds / durationSeconds;
  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');
  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');
  console.log(
      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,
      realTimeFactor.toFixed(3));
  const filename = 'test.wav';
  sherpa_onnx.writeWave(filename, {
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  });
  console.log(`Saved to ${filename}`);
}
main().catch((err) => {
  console.error('TTS failed:', err);
  process.exit(1);
});
Please refer to the Node.js addon API documentation for more details.
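The Node.js example above reports a real-time factor (RTF): the elapsed synthesis time divided by the duration of the generated audio, where the duration is the number of samples divided by the sample rate. The same arithmetic as a small Python sketch follows. The function name `real_time_factor` is hypothetical, and the default of 24000 Hz is taken from the model info table on this page.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate=24000):
    """Return elapsed time divided by audio duration.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration_seconds = num_samples / sample_rate
    return elapsed_seconds / duration_seconds
```

For instance, 48000 samples at 24000 Hz last 2 seconds, so generating them in 1 second corresponds to an RTF of 0.5.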
Dart API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(
    model: './kitten-mini-en-v0_1-fp16/model.fp16.onnx',
    voices: './kitten-mini-en-v0_1-fp16/voices.bin',
    tokens: './kitten-mini-en-v0_1-fp16/tokens.txt',
    dataDir: './kitten-mini-en-v0_1-fp16/espeak-ng-data',
  );
  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    kitten: kitten,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );
  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', config: genConfig);
  tts.free();
  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Swift API.
func run() {
  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(
    model: "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
    voices: "./kitten-mini-en-v0_1-fp16/voices.bin",
    tokens: "./kitten-mini-en-v0_1-fp16/tokens.txt",
    dataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kitten: kitten)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
  let text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2
  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}
@main
struct App {
  static func main() {
    run()
  }
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with C# API.
using System;
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Kitten.Model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx";
config.Model.Kitten.Voices = "./kitten-mini-en-v0_1-fp16/voices.bin";
config.Model.Kitten.Tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt";
config.Model.Kitten.DataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
    Console.WriteLine("Saved to ./test.wav");
}
else
{
    Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
    var config = OfflineTtsConfig(
        model = OfflineTtsModelConfig(
            kitten = OfflineTtsKittenModelConfig(
                model = "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
                voices = "./kitten-mini-en-v0_1-fp16/voices.bin",
                tokens = "./kitten-mini-en-v0_1-fp16/tokens.txt",
                dataDir = "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
            ),
            numThreads = 1,
            debug = true,
        ),
    )
    val tts = OfflineTts(config = config)
    val genConfig = GenerationConfig(
        sid = 0,
        speed = 1.0f,
        silenceScale = 0.2f,
    )
    val audio = tts.generateWithConfigAndCallback(
        text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.",
        config = genConfig,
        callback = ::callback,
    )
    audio.save(filename = "test.wav")
    tts.release()
    println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
    // 1 means to continue
    // 0 means to stop
    return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
    public static void main(String[] args) {
        var kitten = new OfflineTtsKittenModelConfig();
        kitten.setModel("./kitten-mini-en-v0_1-fp16/model.fp16.onnx");
        kitten.setVoices("./kitten-mini-en-v0_1-fp16/voices.bin");
        kitten.setTokens("./kitten-mini-en-v0_1-fp16/tokens.txt");
        kitten.setDataDir("./kitten-mini-en-v0_1-fp16/espeak-ng-data");
        var modelConfig = new OfflineTtsModelConfig();
        modelConfig.setKitten(kitten);
        modelConfig.setNumThreads(1);
        modelConfig.setDebug(true);
        var config = new OfflineTtsConfig();
        config.setModel(modelConfig);
        config.setMaxNumSentences(1);
        var tts = new OfflineTts(config);
        var text = "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.";
        var genConfig = new GenerationConfig();
        genConfig.setSid(0);
        genConfig.setSpeed(1.0f);
        genConfig.setSilenceScale(0.2f);
        var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
            // 1 means to continue, 0 means to stop
            return 1;
        });
        audio.save("test.wav");
        tts.release();
        System.out.println("Saved to test.wav");
    }
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Pascal API.
program test_kitten;
{$mode objfpc}
uses
  SysUtils,
  sherpa_onnx;
var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;
begin
  FillChar(Config, SizeOf(Config), 0);
  Config.Model.Kitten.Model := './kitten-mini-en-v0_1-fp16/model.fp16.onnx';
  Config.Model.Kitten.Voices := './kitten-mini-en-v0_1-fp16/voices.bin';
  Config.Model.Kitten.Tokens := './kitten-mini-en-v0_1-fp16/tokens.txt';
  Config.Model.Kitten.DataDir := './kitten-mini-en-v0_1-fp16/espeak-ng-data';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.MaxNumSentences := 1;
  Tts := TSherpaOnnxOfflineTts.Create(@Config);
  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;
  Audio := Tts.GenerateWithConfig('Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.', @GenConfig, nil);
  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
  WriteLn('Saved to ./test.wav');
  Audio.Free;
  Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with kitten-mini-en-v0_1-fp16 with Go API.
package main
import (
    "fmt"
    sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
    config := sherpa.OfflineTtsConfig{
        Model: sherpa.OfflineTtsModelConfig{
            Kitten: sherpa.OfflineTtsKittenModelConfig{
                Model: "./kitten-mini-en-v0_1-fp16/model.fp16.onnx",
                Voices: "./kitten-mini-en-v0_1-fp16/voices.bin",
                Tokens: "./kitten-mini-en-v0_1-fp16/tokens.txt",
                DataDir: "./kitten-mini-en-v0_1-fp16/espeak-ng-data",
            },
            NumThreads: 1,
            Debug: true,
        },
        MaxNumSentences: 1,
    }
    tts := sherpa.NewOfflineTts(&config)
    defer tts.Delete()
    text := "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
    genConfig := sherpa.GenerationConfig{
        Sid: 0,
        Speed: 1.0,
        SilenceScale: 0.2,
    }
    audio := tts.GenerateWithConfig(text, &genConfig, nil)
    filename := "./test.wav"
    sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
    fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
Friends fell out often because life was changing so fast.
The easiest thing in the world was to lose touch with someone.
sample audios for different speakers are listed below: