matcha-icefall-zh-baker
| Info about this model | Download the model | HF Space | Android APK | Python API |
| C API | C++ API | Rust API | Node.js API | Dart API |
| Swift API | C# API | Kotlin API | Java API | Pascal API |
| Go API | Samples |
Info about this model
This model is trained using the code from https://github.com/k2-fsa/icefall/tree/master/egs/baker_zh/TTS/matcha
It supports only Chinese.
| Number of speakers | Sample rate |
|---|---|
| 1 | 22050 |
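Once you have generated a WAV file with any of the APIs below, you can check that its sample rate matches the value in the table. The following is a minimal sketch using only the Python standard library. It assumes the file is uncompressed PCM, which is what the `wave` module can read; the helper names `wav_info` and `has_expected_sample_rate` are hypothetical and not part of sherpa-onnx.

```python
import wave

# Sample rate taken from the table above.
EXPECTED_SAMPLE_RATE = 22050


def wav_info(path):
    """Return (sample_rate, num_channels, duration_in_seconds) of a PCM WAV file."""
    with wave.open(str(path), "rb") as f:
        sample_rate = f.getframerate()
        num_channels = f.getnchannels()
        duration = f.getnframes() / sample_rate
    return sample_rate, num_channels, duration


def has_expected_sample_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the file's sample rate equals the expected value."""
    return wav_info(path)[0] == expected


# Example usage (hypothetical file name):
#   print(wav_info("test.wav"))
```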
Download the model
You need to download the acoustic model and the vocoder model.
Download the acoustic model
Please use the following code to download the model:
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2
tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2
You should see the following output:
ls -lh matcha-icefall-zh-baker/
total 150848
-rw-r--r--@ 1 fangjun staff 58K 6 Oct 08:39 date.fst
drwxr-xr-x@ 10 fangjun staff 320B 18 Feb 2025 dict
-rw-r--r--@ 1 fangjun staff 1.3M 6 Oct 08:39 lexicon.txt
-rw-r--r--@ 1 fangjun staff 72M 6 Oct 08:39 model-steps-3.onnx
-rw-r--r--@ 1 fangjun staff 63K 6 Oct 08:39 number.fst
-rw-r--r--@ 1 fangjun staff 87K 6 Oct 08:39 phone.fst
-rw-r--r--@ 1 fangjun staff 370B 6 Oct 08:39 README.md
-rw-r--r--@ 1 fangjun staff 19K 6 Oct 08:39 tokens.txt
Note: The dict directory is no longer needed for this model.
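If you prefer to verify the extracted directory from a script rather than by reading the `ls` output, something like the following sketch may help. The file names are taken from the listing above; the helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the directory listing above.
# The dict directory is left out on purpose, since it is no longer needed.
REQUIRED_FILES = (
    "model-steps-3.onnx",
    "lexicon.txt",
    "tokens.txt",
    "phone.fst",
    "date.fst",
    "number.fst",
)


def missing_model_files(model_dir):
    """Return the names of required files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files("matcha-icefall-zh-baker")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files are present")
```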
Download the vocoder model
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx
You should see the following output
ls -lh vocos-22khz-univ.onnx
-rw-r--r--@ 1 fangjun staff 51M 17 Mar 2025 vocos-22khz-univ.onnx
Huggingface space
You can try this model by visiting https://huggingface.co/spaces/k2-fsa/text-to-speech
Huggingface space (WebAssembly, wasm)
You can try this model by visiting
https://huggingface.co/spaces/k2-fsa/web-assembly-zh-tts-matcha
The source code is available at https://github.com/k2-fsa/sherpa-onnx/tree/master/wasm/tts
Android APK
The following table shows the Android TTS Engine APK with this model for sherpa-onnx v
If you don’t know what ABI is, you probably need to select arm64-v8a.
The source code for the APK can be found at
https://github.com/k2-fsa/sherpa-onnx/tree/master/android/SherpaOnnxTtsEngine
Please refer to the documentation for how to build the APK from source code.
More Android APKs can be found at
Python API
The following code shows how to use the Python API of sherpa-onnx with this model.
import sherpa_onnx
import soundfile as sf
config = sherpa_onnx.OfflineTtsConfig(
    model=sherpa_onnx.OfflineTtsModelConfig(
        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(
            acoustic_model="matcha-icefall-zh-baker/model-steps-3.onnx",
            vocoder="vocos-22khz-univ.onnx",
            lexicon="matcha-icefall-zh-baker/lexicon.txt",
            tokens="matcha-icefall-zh-baker/tokens.txt",
        ),
        num_threads=2,
        debug=True,  # set it False to disable debug output
    ),
    max_num_sentences=1,
    rule_fsts="matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
)
if not config.validate():
    raise ValueError("Please check your config")
tts = sherpa_onnx.OfflineTts(config)
text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
audio = tts.generate(text, sid=0, speed=1.0)
sf.write(
    "./test.mp3",
    audio.samples,
    samplerate=audio.sample_rate,
)
You can save it as test-zh.py and then run:
pip install sherpa-onnx soundfile
python3 ./test-zh.py
You will get a file test.mp3 in the end.
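The `rule_fsts` value in the example above is a single comma-separated string of paths. If you keep the model in a different directory, you can build that string instead of typing it by hand. The following is a small sketch; `build_rule_fsts` is a hypothetical helper, and the default order (phone, date, number) simply follows the example above.

```python
def build_rule_fsts(model_dir, names=("phone.fst", "date.fst", "number.fst")):
    """Join rule FST paths into the comma-separated string used for rule_fsts."""
    # Strip a trailing slash so that the result does not contain "//".
    base = str(model_dir).rstrip("/")
    return ",".join(f"{base}/{name}" for name in names)


print(build_rule_fsts("matcha-icefall-zh-baker"))
```

You can then pass the returned string as `rule_fsts` when constructing the config.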
C API
You can use the following code to play with matcha-icefall-zh-baker using the C API.
#include <stdio.h>
#include <string.h>
#include "sherpa-onnx/c-api/c-api.h"
int main() {
  SherpaOnnxOfflineTtsConfig config;
  memset(&config, 0, sizeof(config));
  config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);
  const char *text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
  SherpaOnnxGenerationConfig gen_cfg;
  memset(&gen_cfg, 0, sizeof(gen_cfg));
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;
  const SherpaOnnxGeneratedAudio *audio =
      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &gen_cfg, NULL, NULL);
  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,
                      "./test.wav");
  // You need to free the pointers to avoid memory leak in your app
  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);
  SherpaOnnxDestroyOfflineTts(tts);
  printf("Saved to ./test.wav\n");
  return 0;
}
In the following, we describe how to compile and run the above C example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh.c.
Then you can compile it with the following command:
gcc \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-zh \
  /tmp/test-zh.c \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-c-api \
  -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh.
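If you launch the compiled binary from a script, you can set the library search path for that one process instead of exporting it in your shell. The following is a minimal sketch in Python; `env_with_library_path` is a hypothetical helper, and it assumes that any system other than macOS uses `LD_LIBRARY_PATH`, as in the Linux case above.

```python
import os
import platform


def env_with_library_path(lib_dir, system=None, base_env=None):
    """Return a copy of the environment with lib_dir prepended to the
    dynamic library search path variable for the given system."""
    system = system or platform.system()
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_LIBRARY_PATH; everything else is assumed to use LD_LIBRARY_PATH.
    var = "DYLD_LIBRARY_PATH" if system == "Darwin" else "LD_LIBRARY_PATH"
    existing = env.get(var)
    env[var] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


print(env_with_library_path("/tmp/sherpa-onnx/shared/lib", system="Linux", base_env={}))

# Example usage (not executed here):
#   import subprocess
#   env = env_with_library_path("/tmp/sherpa-onnx/shared/lib")
#   subprocess.run(["./test-zh"], cwd="/tmp", env=env, check=True)
```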
Use static library (static link)
Please see the documentation at
C++ API
You can use the following code to play with matcha-icefall-zh-baker using the C++ API.
#include <cstdint>
#include <cstdio>
#include <string>
#include "sherpa-onnx/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
                                float progress, void *arg) {
  fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
  // return 1 to continue generating
  // return 0 to stop generating
  return 1;
}
int32_t main(int32_t argc, char *argv[]) {
  using namespace sherpa_onnx::cxx;  // NOLINT
  OfflineTtsConfig config;
  config.model.matcha.acoustic_model = "matcha-icefall-zh-baker/model-steps-3.onnx";
  config.model.matcha.vocoder = "vocos-22khz-univ.onnx";
  config.model.matcha.lexicon = "matcha-icefall-zh-baker/lexicon.txt";
  config.model.matcha.tokens = "matcha-icefall-zh-baker/tokens.txt";
  config.model.num_threads = 1;
  // If you want to see debug messages, please set it to 1
  config.model.debug = 0;
  config.rule_fsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
  std::string filename = "./test.wav";
  std::string text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
  auto tts = OfflineTts::Create(config);
  GenerationConfig gen_cfg;
  gen_cfg.sid = 0;
  gen_cfg.speed = 1.0;  // larger -> faster in speech speed
#if 0
  // If you don't want to use a callback, then please enable this branch
  GeneratedAudio audio = tts.Generate(text, gen_cfg);
#else
  GeneratedAudio audio = tts.Generate(text, gen_cfg, ProgressCallback);
#endif
  WriteWave(filename, {audio.samples, audio.sample_rate});
  fprintf(stderr, "Input text is: %s\n", text.c_str());
  fprintf(stderr, "Speaker ID is: %d\n", gen_cfg.sid);
  fprintf(stderr, "Saved to: %s\n", filename.c_str());
  return 0;
}
In the following, we describe how to compile and run the above C++ example.
Use shared library (dynamic link)
cd /tmp
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build-shared
cd build-shared
cmake \
  -DSHERPA_ONNX_ENABLE_C_API=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/tmp/sherpa-onnx/shared \
  ..
make
make install
You can find the required header files and library files inside /tmp/sherpa-onnx/shared.
Assume you have saved the above example file as /tmp/test-zh.cc.
Then you can compile it with the following command:
g++ \
  -std=c++17 \
  -I /tmp/sherpa-onnx/shared/include \
  -o /tmp/test-zh \
  /tmp/test-zh.cc \
  -L /tmp/sherpa-onnx/shared/lib \
  -lsherpa-onnx-cxx-api \
  -lsherpa-onnx-c-api \
  -lonnxruntime
Now you can run
cd /tmp
# Assume you have downloaded the acoustic model as well as the vocoder model and put them in /tmp
./test-zh
You probably need to run
# For Linux
export LD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$LD_LIBRARY_PATH
# For macOS
export DYLD_LIBRARY_PATH=/tmp/sherpa-onnx/shared/lib:$DYLD_LIBRARY_PATH
before you run /tmp/test-zh.
Use static library (static link)
Please see the documentation at
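The progress callback in the C++ example above returns 1 to continue generating and 0 to stop. The same convention can be sketched in Python as a callback that stops once a chosen progress threshold is reached. The helper `make_progress_callback` is hypothetical. Passing such a callback to the Python `generate()` method (for example through a `callback` keyword argument) is an assumption that this page does not show, so please check the Python API documentation before relying on it.

```python
def make_progress_callback(stop_at=None):
    """Create a callback that prints progress and follows the
    1 = continue / 0 = stop convention described above.

    stop_at: progress value in [0, 1] at which to stop, or None to never stop.
    """

    def callback(samples, progress):
        print(f"Progress: {progress * 100:.3f}%")
        if stop_at is not None and progress >= stop_at:
            return 0  # stop generating
        return 1  # continue generating

    return callback


# Simulate three progress updates without running any model.
callback = make_progress_callback(stop_at=0.5)
for progress in (0.25, 0.5, 0.75):
    if callback([], progress) == 0:
        print("Stopped early")
        break
```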
Rust API
You can use the following code to play with matcha-icefall-zh-baker with the Rust API.
use sherpa_onnx::{
    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,
};
fn main() {
    let config = OfflineTtsConfig {
        model: sherpa_onnx::OfflineTtsModelConfig {
            matcha: OfflineTtsMatchaModelConfig {
                acoustic_model: Some("matcha-icefall-zh-baker/model-steps-3.onnx".into()),
                vocoder: Some("vocos-22khz-univ.onnx".into()),
                tokens: Some("matcha-icefall-zh-baker/tokens.txt".into()),
                lexicon: Some("matcha-icefall-zh-baker/lexicon.txt".into()),
                ..Default::default()
            },
            num_threads: 2,
            debug: false,
            ..Default::default()
        },
        rule_fsts: Some("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst".into()),
        ..Default::default()
    };
    let tts = OfflineTts::create(&config).expect("Failed to create OfflineTts");
    println!("Sample rate: {}", tts.sample_rate());
    println!("Num speakers: {}", tts.num_speakers());
    let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
    let gen_config = GenerationConfig {
        sid: 0,
        speed: 1.0,
        ..Default::default()
    };
    let audio = tts
        .generate_with_config(
            text,
            &gen_config,
            Some(|_samples: &[f32], progress: f32| -> bool {
                println!("Progress: {:.1}%", progress * 100.0);
                true
            }),
        )
        .expect("Generation failed");
    let filename = "./test.wav";
    if audio.save(filename) {
        println!("Saved to: {}", filename);
    } else {
        eprintln!("Failed to save {}", filename);
    }
}
Please refer to the Rust API documentation for how to build and run the above Rust example.
Node.js (addon) API
You need to install the sherpa-onnx-node npm package first:
npm install sherpa-onnx-node
You can use the following code to play with matcha-icefall-zh-baker with the Node.js addon API.
const sherpa_onnx = require('sherpa-onnx-node');
function createOfflineTts() {
  const config = {
    model: {
      matcha: {
        acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
        vocoder: 'vocos-22khz-univ.onnx',
        tokens: 'matcha-icefall-zh-baker/tokens.txt',
        lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
      },
      debug: true,
      numThreads: 1,
      provider: 'cpu',
    },
    maxNumSentences: 1,
    ruleFsts: 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst',
  };
  return new sherpa_onnx.OfflineTts(config);
}
const tts = createOfflineTts();
const text = '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.';
const generationConfig = new sherpa_onnx.GenerationConfig({
  sid: 0,
  speed: 1.0,
  silenceScale: 0.2,
});
let start = Date.now();
const audio = tts.generate({text, generationConfig});
let stop = Date.now();
const elapsed_seconds = (stop - start) / 1000;
const duration = audio.samples.length / audio.sampleRate;
const real_time_factor = elapsed_seconds / duration;
console.log('Wave duration', duration.toFixed(3), 'seconds');
console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');
console.log(
    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,
    real_time_factor.toFixed(3));
const filename = 'test.wav';
sherpa_onnx.writeWave(
    filename, {samples: audio.samples, sampleRate: audio.sampleRate});
console.log(`Saved to ${filename}`);
Please refer to the Node.js addon API documentation for more details.
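The Node.js example above reports the real-time factor (RTF), which is the processing time divided by the duration of the generated audio. An RTF below 1 means that synthesis runs faster than playback. The following sketch shows the same arithmetic in Python; `real_time_factor` is a hypothetical helper, and the numbers in the example are made up for illustration rather than measured.

```python
def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """Return elapsed time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    duration = num_samples / sample_rate  # audio length in seconds
    return elapsed_seconds / duration


# Illustrative numbers only: 44100 samples at 22050 Hz is 2 seconds of audio.
rtf = real_time_factor(elapsed_seconds=0.5, num_samples=44100, sample_rate=22050)
print(f"RTF = {rtf:.3f}")
```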
Dart API
You can use the following code to play with matcha-icefall-zh-baker with the Dart API.
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;
void main() {
  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(
    acousticModel: 'matcha-icefall-zh-baker/model-steps-3.onnx',
    vocoder: 'vocos-22khz-univ.onnx',
    tokens: 'matcha-icefall-zh-baker/tokens.txt',
    lexicon: 'matcha-icefall-zh-baker/lexicon.txt',
  );
  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(
    matcha: matcha,
    numThreads: 1,
    debug: true,
  );
  final config = sherpa_onnx.OfflineTtsConfig(
    model: modelConfig,
    maxNumSenetences: 1,
  );
  final tts = sherpa_onnx.OfflineTts(config);
  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(
    sid: 0,
    speed: 1.0,
    silenceScale: 0.2,
  );
  final audio = tts.generateWithConfig(text: '某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', config: genConfig);
  tts.free();
  sherpa_onnx.writeWave(
    filename: 'test.wav',
    samples: audio.samples,
    sampleRate: audio.sampleRate,
  );
  print('Saved to test.wav');
}
Please refer to the Dart API documentation for more details.
Swift API
You can use the following code to play with matcha-icefall-zh-baker with the Swift API.
func run() {
  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(
    acousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
    vocoder: "vocos-22khz-univ.onnx",
    tokens: "matcha-icefall-zh-baker/tokens.txt",
    dataDir: "",
    lexicon: "matcha-icefall-zh-baker/lexicon.txt"
  )
  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)
  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)
  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)
  let text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
  var genConfig = SherpaOnnxGenerationConfigSwift()
  genConfig.sid = 0
  genConfig.speed = 1.0
  genConfig.silenceScale = 0.2
  let audio = tts.generateWithConfig(text: text, config: genConfig, callback: nil, arg: nil)
  let filename = "test.wav"
  let ok = audio.save(filename: filename)
  if ok == 1 {
    print("Saved to \(filename)")
  } else {
    print("Failed to save \(filename)")
  }
}
@main
struct App {
  static func main() {
    run()
  }
}
Please refer to the Swift API documentation for more details.
C# API
You can use the following code to play with matcha-icefall-zh-baker with the C# API.
using SherpaOnnx;
var config = new OfflineTtsConfig();
config.Model.Matcha.AcousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx";
config.Model.Matcha.Vocoder = "vocos-22khz-univ.onnx";
config.Model.Matcha.Tokens = "matcha-icefall-zh-baker/tokens.txt";
config.Model.Matcha.Lexicon = "matcha-icefall-zh-baker/lexicon.txt";
config.Model.NumThreads = 1;
config.Model.Debug = 1;
config.Model.Provider = "cpu";
config.RuleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst";
config.MaxNumSentences = 1;
var tts = new OfflineTts(config);
var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();
genConfig.Sid = 0;
genConfig.Speed = 1.0f;
genConfig.SilenceScale = 0.2f;
var audio = tts.GenerateWithConfig(text, genConfig, null);
var ok = audio.SaveToWaveFile("./test.wav");
if (ok)
{
    Console.WriteLine("Saved to ./test.wav");
}
else
{
    Console.WriteLine("Failed to save ./test.wav");
}
Please refer to the C# API documentation for more details.
Kotlin API
You can use the following code to play with matcha-icefall-zh-baker with the Kotlin API.
package com.k2fsa.sherpa.onnx
fun main() {
    var config = OfflineTtsConfig(
        model = OfflineTtsModelConfig(
            matcha = OfflineTtsMatchaModelConfig(
                acousticModel = "matcha-icefall-zh-baker/model-steps-3.onnx",
                vocoder = "vocos-22khz-univ.onnx",
                tokens = "matcha-icefall-zh-baker/tokens.txt",
                lexicon = "matcha-icefall-zh-baker/lexicon.txt",
            ),
            numThreads = 1,
            debug = true,
        ),
        ruleFsts = "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
    )
    val tts = OfflineTts(config = config)
    val genConfig = GenerationConfig(
        sid = 0,
        speed = 1.0f,
        silenceScale = 0.2f,
    )
    val audio = tts.generateWithConfigAndCallback(
        text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.",
        config = genConfig,
        callback = ::callback,
    )
    audio.save(filename = "test.wav")
    tts.release()
    println("Saved to test.wav")
}
fun callback(samples: FloatArray): Int {
    // 1 means to continue
    // 0 means to stop
    return 1
}
Please refer to the Kotlin API documentation for more details.
Java API
You can use the following code to play with matcha-icefall-zh-baker with the Java API.
import com.k2fsa.sherpa.onnx.*;
public class TtsDemo {
  public static void main(String[] args) {
    var matcha = new OfflineTtsMatchaModelConfig();
    matcha.setAcousticModel("matcha-icefall-zh-baker/model-steps-3.onnx");
    matcha.setVocoder("vocos-22khz-univ.onnx");
    matcha.setTokens("matcha-icefall-zh-baker/tokens.txt");
    matcha.setLexicon("matcha-icefall-zh-baker/lexicon.txt");
    var modelConfig = new OfflineTtsModelConfig();
    modelConfig.setMatcha(matcha);
    modelConfig.setNumThreads(1);
    modelConfig.setDebug(true);
    var config = new OfflineTtsConfig();
    config.setModel(modelConfig);
    config.setRuleFsts("matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst");
    config.setMaxNumSentences(1);
    var tts = new OfflineTts(config);
    var text = "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.";
    var genConfig = new GenerationConfig();
    genConfig.setSid(0);
    genConfig.setSpeed(1.0f);
    genConfig.setSilenceScale(0.2f);
    var audio = tts.generateWithConfigAndCallback(text, genConfig, (samples) -> {
      // 1 means to continue, 0 means to stop
      return 1;
    });
    audio.save("test.wav");
    tts.release();
    System.out.println("Saved to test.wav");
  }
}
Please refer to the Java API documentation for more details.
Pascal API
You can use the following code to play with matcha-icefall-zh-baker with the Pascal API.
program test_matcha;
{$mode objfpc}
uses
  SysUtils,
  sherpa_onnx;
var
  Config: TSherpaOnnxOfflineTtsConfig;
  Tts: TSherpaOnnxOfflineTts;
  Audio: TSherpaOnnxGeneratedAudio;
  GenConfig: TSherpaOnnxGenerationConfig;
begin
  FillChar(Config, SizeOf(Config), 0);
  Config.Model.Matcha.AcousticModel := 'matcha-icefall-zh-baker/model-steps-3.onnx';
  Config.Model.Matcha.Vocoder := 'vocos-22khz-univ.onnx';
  Config.Model.Matcha.Tokens := 'matcha-icefall-zh-baker/tokens.txt';
  Config.Model.Matcha.Lexicon := 'matcha-icefall-zh-baker/lexicon.txt';
  Config.Model.NumThreads := 1;
  Config.Model.Debug := True;
  Config.RuleFsts := 'matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst';
  Config.MaxNumSentences := 1;
  Tts := TSherpaOnnxOfflineTts.Create(@Config);
  GenConfig.Sid := 0;
  GenConfig.Speed := 1.0;
  GenConfig.SilenceScale := 0.2;
  Audio := Tts.GenerateWithConfig('某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.', @GenConfig, nil);
  WriteWave('./test.wav', Audio.Samples, Audio.N, Audio.SampleRate);
  WriteLn('Saved to ./test.wav');
  Audio.Free;
  Tts.Free;
end.
Please refer to the Pascal API documentation for more details.
Go API
You can use the following code to play with matcha-icefall-zh-baker with the Go API.
package main
import (
	"fmt"
	sherpa "github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx"
)
func main() {
	config := sherpa.OfflineTtsConfig{
		Model: sherpa.OfflineTtsModelConfig{
			Matcha: sherpa.OfflineTtsMatchaModelConfig{
				AcousticModel: "matcha-icefall-zh-baker/model-steps-3.onnx",
				Vocoder:       "vocos-22khz-univ.onnx",
				Tokens:        "matcha-icefall-zh-baker/tokens.txt",
				Lexicon:       "matcha-icefall-zh-baker/lexicon.txt",
			},
			NumThreads: 1,
			Debug:      true,
		},
		RuleFsts:        "matcha-icefall-zh-baker/phone.fst,matcha-icefall-zh-baker/date.fst,matcha-icefall-zh-baker/number.fst",
		MaxNumSentences: 1,
	}
	tts := sherpa.NewOfflineTts(&config)
	defer tts.Delete()
	text := "某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔."
	genConfig := sherpa.GenerationConfig{
		Sid:          0,
		Speed:        1.0,
		SilenceScale: 0.2,
	}
	audio := tts.GenerateWithConfig(text, &genConfig, nil)
	filename := "./test.wav"
	sherpa.WriteWave(filename, audio.Samples, audio.SampleRate)
	fmt.Printf("Saved to %s\n", filename)
}
Please refer to the Go API documentation for more details.
Samples
For the following text:
某某银行的副行长和一些行政领导表示,他们去过长江和长白山; 经济不断增长。2024年12月31号,拨打110或者18920240511。123456块钱。当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感受着生命的奇迹与温柔.
sample audios for different speakers are listed below: