On-device VAD + ASR
This page describes how to build SherpaOnnxVadAsr for on-device non-streaming speech recognition that runs on HarmonyOS.
Hint
This page is for non-streaming models.
This page is NOT for streaming models.
Open the project with DevEco Studio
You need to first download the code:
# Assume we place it inside /Users/fangjun/open-source
# You can place it anywhere you like.
cd /Users/fangjun/open-source/
git clone https://github.com/k2-fsa/sherpa-onnx
Then start DevEco Studio and follow the screenshots below:
[Image: Screenshot of starting DevEco Studio]
Fig. 74 Step 1: Click Open
[Image: Screenshot of selecting SherpaOnnxVadAsr to open]
Fig. 75 Step 2: Select SherpaOnnxVadAsr inside the harmony-os folder and click Open
[Image: Screenshot of checking the version]
Fig. 76 Step 3: Check that it is using the latest version. You can visit sherpa_onnx to check available versions.
Download a VAD model
The first thing we have to do is to download the VAD model and put it inside the directory rawfile.
Caution
The model MUST be placed inside the directory rawfile.
cd /Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
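Since the model must be placed inside rawfile, it can help to confirm that the download landed in the right place before opening the project. The following is a minimal sketch and not part of sherpa-onnx: the function names rawfile_dir and check_vad_model are hypothetical, and the only things assumed are the directory layout and the file name used by the commands above.

```shell
# Hypothetical helper: print the rawfile directory for a given sherpa-onnx checkout.
# A trailing slash on the checkout path is dropped.
rawfile_dir() {
  printf '%s/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile\n' "${1%/}"
}

# Hypothetical helper: succeed only if silero_vad.onnx exists and is non-empty
# inside the given rawfile directory.
check_vad_model() {
  if [ -s "$1/silero_vad.onnx" ]; then
    echo "OK: found $1/silero_vad.onnx"
  else
    echo "MISSING: $1/silero_vad.onnx" >&2
    return 1
  fi
}

# Usage (adjust the checkout path to your machine):
#   check_vad_model "$(rawfile_dir /Users/fangjun/open-source/sherpa-onnx)"
```

The check uses -s rather than -f so that an empty file left behind by an interrupted download is also reported as missing.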
Select a non-streaming ASR model
The code supports many non-streaming models; the two used below are downloaded from the asr-models release of the sherpa-onnx GitHub repository.
We have to modify the code to use the model that we choose.
Hint
You can try the above models in a Hugging Face space.
We give two examples below showing how to use the following two models: sherpa-onnx-moonshine-tiny-en-int8 and sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.
Use sherpa-onnx-moonshine-tiny-en-int8
First, we download and unzip the model.
Caution
The model MUST be placed inside the directory rawfile.
cd /Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
# Remove unused files
rm -rf sherpa-onnx-moonshine-tiny-en-int8/test_wavs
Please check that your directory looks exactly like the following at this point:
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ pwd
/Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls -lh
total 3536
drwxr-xr-x 9 fangjun staff 288B Dec 6 15:42 sherpa-onnx-moonshine-tiny-en-int8
-rw-r--r-- 1 fangjun staff 1.7M Nov 28 18:13 silero_vad.onnx
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ tree .
.
├── sherpa-onnx-moonshine-tiny-en-int8
│ ├── LICENSE
│ ├── README.md
│ ├── cached_decode.int8.onnx
│ ├── encode.int8.onnx
│ ├── preprocess.onnx
│ ├── tokens.txt
│ └── uncached_decode.int8.onnx
└── silero_vad.onnx
1 directory, 8 files
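The comparison with the listing above can also be done with a short script. This is only a sketch under stated assumptions: check_model_files is a hypothetical helper name, and the file names passed to it in the usage comment are simply the ones shown in the tree output above.

```shell
# Hypothetical helper: report every expected file that is missing from a model directory.
# $1 is the model directory; the remaining arguments are file names relative to it.
check_model_files() {
  dir=$1
  shift
  missing=0
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      echo "MISSING: $dir/$f" >&2
      missing=1
    fi
  done
  # Return 0 when every file was found, 1 otherwise.
  return "$missing"
}

# Usage, run from inside the rawfile directory:
#   check_model_files sherpa-onnx-moonshine-tiny-en-int8 \
#     preprocess.onnx encode.int8.onnx uncached_decode.int8.onnx \
#     cached_decode.int8.onnx tokens.txt
```

The helper keeps going after the first missing file so that a single run lists everything that still needs to be fixed.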
Now you should see the following inside DevEco Studio:
[Image: Screenshot of sherpa-onnx-moonshine-tiny-en-int8 inside rawfile]
Fig. 77 Step 4: Check the model directory inside the rawfile directory.
Now it is time to modify the code to use our model.
We need to change NonStreamingAsrWithVadWorker.ets.
[Image: Screenshot of changing code for moonshine]
Fig. 78 Step 5: Change the code to use our selected model
Finally, we can build the project. See the screenshot below:
[Image: Screenshot of building the project]
Fig. 79 Step 6: Build the project
If you have an emulator, you can now start it.
[Image: Screenshot of selecting device manager]
Fig. 80 Step 7: Select the device manager
[Image: Screenshot of starting the emulator]
Fig. 81 Step 8: Start the emulator
After the emulator is started, follow the screenshot below to run the app on the emulator:
[Image: Screenshot of starting the app on the emulator]
Fig. 82 Step 9: Start the app on the emulator
You should see something like the following:
[Image: Screenshot of app running on the emulator]
Fig. 83 Step 10: Click Allow to allow the app to access the microphone
[Image: Screenshot of selecting a file for recognition]
Fig. 84 Step 11: Select a .wav file for recognition
[Image: Screenshot of starting the microphone]
Fig. 85 Step 12: Start the microphone to record speech for recognition
Congratulations!
You have successfully run an on-device non-streaming speech recognition app on HarmonyOS!
Use sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17
First, we download and unzip the model.
Caution
The model MUST be placed inside the directory rawfile.
cd /Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
# Remove unused files
rm -rf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs
rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx
Please check that your directory looks exactly like the following at this point:
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ pwd
/Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls
sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 silero_vad.onnx
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/
total 493616
-rw-r--r-- 1 fangjun staff 71B Jul 18 21:06 LICENSE
-rw-r--r-- 1 fangjun staff 104B Jul 18 21:06 README.md
-rwxr-xr-x 1 fangjun staff 5.8K Jul 18 21:06 export-onnx.py
-rw-r--r-- 1 fangjun staff 228M Jul 18 21:06 model.int8.onnx
-rw-r--r-- 1 fangjun staff 308K Jul 18 21:06 tokens.txt
(py38) fangjuns-MacBook-Pro:rawfile fangjun$ tree .
.
├── sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17
│ ├── LICENSE
│ ├── README.md
│ ├── export-onnx.py
│ ├── model.int8.onnx
│ └── tokens.txt
└── silero_vad.onnx
1 directory, 6 files
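For this model the cleanup step matters as much as the download, because the commands above delete model.onnx and test_wavs and keep only the int8 model. The sketch below checks both sides of that. It is illustrative only: check_sense_voice_dir is a hypothetical helper name, and the file names are taken from the commands and listing above.

```shell
# Hypothetical helper: succeed only if the SenseVoice directory still has
# model.int8.onnx and tokens.txt, and no longer contains model.onnx or test_wavs.
check_sense_voice_dir() {
  dir=$1
  status=0
  # Files the listing above shows as kept.
  for f in model.int8.onnx tokens.txt; do
    [ -f "$dir/$f" ] || { echo "MISSING: $dir/$f" >&2; status=1; }
  done
  # Entries the commands above remove.
  for f in model.onnx test_wavs; do
    [ ! -e "$dir/$f" ] || { echo "SHOULD BE REMOVED: $dir/$f" >&2; status=1; }
  done
  return "$status"
}

# Usage, run from inside the rawfile directory:
#   check_sense_voice_dir sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17
```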
Now you should see the following inside DevEco Studio:
[Image: Screenshot of sense voice inside rawfile]
Fig. 86 Step 4: Check the model directory inside the rawfile directory.
Now it is time to modify the code to use our model.
We need to change NonStreamingAsrWithVadWorker.ets.
[Image: Screenshot of changing code for sense voice]
Fig. 87 Step 5-1: Change the code to use our selected model
[Image: Screenshot of changing code for sense voice]
Fig. 88 Step 5-2: Change the code to use our selected model
Finally, we can build the project. See the screenshot below:
[Image: Screenshot of building the project]
Fig. 89 Step 6: Build the project
If you have an emulator, you can now start it.
[Image: Screenshot of selecting device manager]
Fig. 90 Step 7: Select the device manager
[Image: Screenshot of starting the emulator]
Fig. 91 Step 8: Start the emulator
After the emulator is started, follow the screenshot below to run the app on the emulator:
[Image: Screenshot of starting the app on the emulator]
Fig. 92 Step 9: Start the app on the emulator
[Image: Screenshot of app running on the emulator]
Fig. 93 Step 10: Click Allow to allow the app to access the microphone
[Image: Screenshot of selecting a file for recognition]
Fig. 94 Step 11: Select a .wav file for recognition
[Image: Screenshot of starting the microphone]
Fig. 95 Step 12: Start the microphone to record speech for recognition
Congratulations!
You have successfully run an on-device non-streaming speech recognition app on HarmonyOS!