This documentation covers the public native APIs shipped in:

These headers expose the main sherpa-onnx inference features for native applications and for language bindings that need a stable ABI.

What is documented here

The generated docs include the public APIs for:

Model-specific documentation

Each model family has its own documentation page with config examples:

Non-Streaming (Offline) ASR Models: Zipformer Transducer, Zipformer CTC, Whisper, SenseVoice, Paraformer, Moonshine, FireRedAsr, Dolphin, Canary, Cohere, WeNet, Omnilingual, FunASR Nano, Qwen3, MedASR, TeleSpeech, GigaAM v2, Parakeet TDT, NeMo CTC
Streaming (Online) ASR Models: Transducer (Zipformer), Nemotron, Paraformer, Zipformer2 CTC, T-One CTC
Text-to-Speech (TTS) Models: Kokoro, VITS (Piper), Matcha, Kitten, ZipVoice, Pocket, Supertonic
Voice Activity Detection (VAD): Silero VAD, Ten VAD
Audio Tagging: Zipformer, CED
Punctuation Restoration: Offline (CT-Transformer), Online (CNN-BiLSTM)
Speech Enhancement / Denoising: GTCRN, DPDFNet (offline and online)
Source Separation: Spleeter, UVR
Offline Speaker Diarization: Pyannote segmentation + embedding clustering
Speaker Embedding Extraction and Management: extraction, enrollment, search, verification
Spoken Language Identification: Whisper-based
Keyword Spotting: Zipformer KWS
Linear Resampler

The C API also includes HarmonyOS-specific constructor variants where applicable.

Use c-api.h if you are:

Use cxx-api.h if you are:

For the C API:

objects created by SherpaOnnxCreate*() are usually destroyed with the matching SherpaOnnxDestroy*()
result snapshots, returned strings, and returned arrays must be released with the specific free/destroy function documented for the API that returned them
some helpers return pointers to statically owned strings; those must never be freed
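
The three ownership rules above can be sketched in plain C. The block below does not call sherpa-onnx: every Demo* name is a hypothetical stand-in invented for illustration, and the real function names, signatures, and release functions are the ones documented in c-api.h. It only shows the shape of the pattern, assuming a handle created and destroyed in pairs, a result snapshot with its own release function, and a helper that returns a statically owned string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT part of sherpa-onnx. */
typedef struct DemoRecognizer { int unused; } DemoRecognizer;
typedef struct DemoResult { char *text; } DemoResult;

/* Counts objects that were handed out and not yet released. */
static int g_live_objects = 0;

/* Rule 1: a Create function hands ownership to the caller ... */
const DemoRecognizer *DemoCreateRecognizer(void) {
  DemoRecognizer *rec = calloc(1, sizeof(*rec));
  if (rec) ++g_live_objects;
  return rec;
}

/* ... and only the matching Destroy function may release it. */
void DemoDestroyRecognizer(const DemoRecognizer *rec) {
  if (!rec) return;
  free((void *)rec);
  --g_live_objects;
}

/* Rule 2: a result snapshot is a separate heap object with its own
 * release function. */
const DemoResult *DemoGetResult(const DemoRecognizer *rec) {
  static const char kText[] = "hello world";
  if (!rec) return NULL;
  DemoResult *res = malloc(sizeof(*res));
  if (!res) return NULL;
  res->text = malloc(sizeof(kText));
  if (!res->text) {
    free(res);
    return NULL;
  }
  memcpy(res->text, kText, sizeof(kText));
  ++g_live_objects;
  return res;
}

void DemoDestroyResult(const DemoResult *res) {
  if (!res) return;
  free(res->text);
  free((void *)res);
  --g_live_objects;
}

/* Rule 3: a statically owned string; the caller must not free it. */
const char *DemoGetVersionStr(void) {
  static const char kVersion[] = "demo-1.0";
  return kVersion;
}

/* Typical call order: create, use, release the result, destroy the handle.
 * Returns the number of objects still alive (0 means nothing leaked),
 * or -1 if an allocation failed. */
int DemoRun(void) {
  const DemoRecognizer *rec = DemoCreateRecognizer();
  if (!rec) return -1;

  const DemoResult *res = DemoGetResult(rec);
  if (!res) {
    DemoDestroyRecognizer(rec);
    return -1;
  }
  assert(res->text != NULL);
  printf("%s (%s)\n", res->text, DemoGetVersionStr());

  DemoDestroyResult(res);     /* release the snapshot first */
  DemoDestroyRecognizer(rec); /* then the object that produced it */
  /* No free() on the version string: it is statically owned. */
  return g_live_objects;
}
```

Calling DemoRun() from a main() function exercises the whole sequence. Releasing the result before the handle that produced it is a conservative ordering chosen for this sketch; whether a given sherpa-onnx result may outlive its parent object is specified per API in the header.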

For the C++ API:

For both APIs, the usual flow is:

Start with:

Representative example programs live in:

Useful examples include:

Offline ASR (C API):

Streaming ASR (C API):

TTS (C API):

Other features (C API):

C++ API examples:

From sherpa-onnx/c-api/, run:

doxygen Doxyfile

HTML output is written to:

doxygen-docs/html/