Speech Enhancement
Remove background noise from audio using a GTCRN (Grouped Temporal Convolutional Recurrent Network) model. This is useful for cleaning up noisy recordings before transcription.
Source file
nodejs-addon-examples/test_offline_speech_enhancement_gtcrn.js
Code
// Copyright (c) 2025 Xiaomi Corporation
//
// Offline speech enhancement (denoising) using a GTCRN model.
//
// Usage:
//   node speech_enhancement.js
//
const sherpa_onnx = require('sherpa-onnx-node');

// Download models from
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models
function createOfflineSpeechDenoiser() {
  const config = {
    model: {
      gtcrn: {model: './gtcrn_simple.onnx'},
      debug: true,
      numThreads: 1,
    },
  };
  return new sherpa_onnx.OfflineSpeechDenoiser(config);
}

const sd = createOfflineSpeechDenoiser();

const waveFilename = './inp_16k.wav';
const wave = sherpa_onnx.readWave(waveFilename);

// run() accepts {samples, sampleRate, enableExternalBuffer} and returns
// {samples, sampleRate}.
const denoised = sd.run({
  samples: wave.samples,
  sampleRate: wave.sampleRate,
  enableExternalBuffer: true,
});

sherpa_onnx.writeWave(
    './enhanced-16k.wav',
    {samples: denoised.samples, sampleRate: denoised.sampleRate});

console.log('Saved to ./enhanced-16k.wav');
How to run
Install the package:
npm install sherpa-onnx-node
Download the model and test file:
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
Set the library path and run:
# macOS
export DYLD_LIBRARY_PATH=$(npm root)/sherpa-onnx-node/lib:$DYLD_LIBRARY_PATH
# Linux
export LD_LIBRARY_PATH=$(npm root)/sherpa-onnx-node/lib:$LD_LIBRARY_PATH

node speech_enhancement.js
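If the script fails at startup, the usual suspects are a missing download or an unset library path. The sketch below is one possible preflight check, not part of sherpa-onnx-node: `libraryPathVariable` and `preflight` are hypothetical helper names, and looking for the substring `sherpa-onnx-node` in the environment variable is only a heuristic based on the export commands above.

```javascript
const fs = require('fs');

// Name of the dynamic-library search variable used in the commands above.
function libraryPathVariable(platform = process.platform) {
  if (platform === 'darwin') return 'DYLD_LIBRARY_PATH';
  if (platform === 'linux') return 'LD_LIBRARY_PATH';
  return null;  // other platforms are not covered on this page
}

// Returns a list of human-readable problems; an empty list means
// nothing obvious is wrong.
function preflight(files, env = process.env, platform = process.platform) {
  const problems = [];
  for (const file of files) {
    if (!fs.existsSync(file)) problems.push(`missing file: ${file}`);
  }
  const variable = libraryPathVariable(platform);
  if (variable !== null) {
    // Heuristic: the export commands add a path containing the package name.
    const value = env[variable] || '';
    if (!value.includes('sherpa-onnx-node')) {
      problems.push(`${variable} has no sherpa-onnx-node entry`);
    }
  }
  return problems;
}

for (const problem of preflight(['./gtcrn_simple.onnx', './inp_16k.wav'])) {
  console.warn(problem);
}
```

An empty result does not guarantee that the addon loads; it only rules out the two most common setup mistakes.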
Expected output
Saved to ./enhanced-16k.wav
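The script only prints the output path, so it does not show whether the audio changed. One rough sanity check is to compare the RMS level of the input and output samples. The sketch below assumes that `wave.samples` and `denoised.samples` are `Float32Array` values in the range [-1, 1]; `rmsDb` is a hypothetical helper, and the two arrays are synthetic stand-ins so the snippet runs without a model.

```javascript
// Root-mean-square level of a block of samples, in dB relative to full scale.
function rmsDb(samples) {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return 10 * Math.log10(sumSquares / samples.length);
}

// Synthetic stand-ins for wave.samples and denoised.samples:
// a tone plus a weaker interfering tone, and the tone alone.
const noisy = Float32Array.from(
    {length: 16000}, (_, i) => 0.5 * Math.sin(i / 10) + 0.1 * Math.sin(i * 1.7));
const cleaned = Float32Array.from(
    {length: 16000}, (_, i) => 0.5 * Math.sin(i / 10));

console.log(`input:  ${rmsDb(noisy).toFixed(1)} dBFS`);
console.log(`output: ${rmsDb(cleaned).toFixed(1)} dBFS`);
```

In the real script, pass `wave.samples` and `denoised.samples` instead. A lower output level is consistent with noise having been removed, but it is not proof of quality; listening to the file remains the real test.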
Notes
OfflineSpeechDenoiser processes the entire audio file at once.
run() accepts {samples, sampleRate, enableExternalBuffer} and returns {samples, sampleRate}.
enableExternalBuffer: true enables zero-copy buffer sharing.
The output sample rate matches the input sample rate (16 kHz in this example).
You can also use dpdfnet_baseline.onnx as an alternative model.
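To make "zero-copy" concrete, the snippet below shows the general difference between sharing and copying sample data with JavaScript typed arrays. It illustrates the concept only; how the addon implements enableExternalBuffer internally is not shown on this page, so treat the mapping to sherpa-onnx-node as an assumption.

```javascript
// One underlying buffer holding four 32-bit float samples.
const buffer = new ArrayBuffer(4 * Float32Array.BYTES_PER_ELEMENT);

const original = new Float32Array(buffer);
original.set([0.1, 0.2, 0.3, 0.4]);

// A second view over the same ArrayBuffer: no sample data is copied.
const sharedView = new Float32Array(buffer);

// slice() allocates a new buffer and copies the samples into it.
const copied = original.slice();

// A write through one view is visible through the other view,
// but not in the independent copy.
original[0] = 0.9;

console.log(sharedView[0] === original[0]);  // true: same memory
console.log(copied[0] === original[0]);      // false: independent copy
```

Sharing avoids duplicating large sample arrays, at the cost that any later write to the shared memory is visible to every view of it.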