
ml-audio-classifier-example-for-pico's People

Contributors

ep1cman, sandeepmistry

ml-audio-classifier-example-for-pico's Issues

Can I use this approach to detect a certain set of spoken words?

Greetings!

I am trying to make an IoT device setup with a Raspberry Pi Pico and an Adafruit PDM MEMS Microphone Breakout. My goal is to detect certain keywords in speech in real time. I have the data from the Speech Commands Dataset. Can I use this dataset and follow this notebook to make it detect keywords? If so, please give me some basic hints about where to change the code.

Thanks in advance 🙂
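For anyone attempting this, here is a minimal sketch of how keyword data could be fed into the same pipeline. It assumes a hypothetical datasets/speech_commands/<keyword>/ layout after extracting the Speech Commands archive, and that the notebook's map helpers (load_wav_for_map, split_wav_for_flat_map, create_arm_spectrogram_for_map) are in scope; it is a sketch, not the notebook's code.

import tensorflow as tf

# Hypothetical layout: datasets/speech_commands/<keyword>/*.wav after extracting
# the Speech Commands archive; each keyword folder becomes one integer label.
keywords = ["yes", "no", "stop", "go"]  # example keywords, not from the notebook

keyword_ds = None
for label, word in enumerate(keywords):
    ds = tf.data.Dataset.list_files(
        f"datasets/speech_commands/{word}/*.wav", shuffle=False)
    # Mirror the notebook's (path, label, split) triple; -1 follows the same
    # convention used for the custom fire-alarm data later on this page.
    ds = ds.map(lambda path, label=label: (path, label, -1))
    keyword_ds = ds if keyword_ds is None else keyword_ds.concatenate(ds)

# From here the notebook's existing maps (load_wav_for_map,
# split_wav_for_flat_map, create_arm_spectrogram_for_map) would be applied
# unchanged, and the model's final Dense layer would need len(keywords)
# outputs with a softmax instead of the single fire-alarm output.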

What to modify for multi-label classification?

Hi, I just want to reach out to ask what to modify in the MLModel::predict() function if I plan to do some multi-label classification.
In particular, I think I need to change these two lines of code in ml_model.cpp.

float y_quantized = _output_tensor->data.int8[0];
float y = (y_quantized - _output_tensor->params.zero_point) * _output_tensor->params.scale;

The model I intend to use will output 4 numbers for the label probabilities (4-label classification).
I wonder how I could modify _output_tensor->data to get the 4 numbers.
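A hedged sketch of the same dequantization applied to a 4-element output, shown with the Python tf.lite.Interpreter rather than the on-device code; in ml_model.cpp the analogous change would be to loop over the four int8 entries of the output tensor and apply the quoted formula to each index, not just index 0. The model path is hypothetical.

import numpy as np
import tensorflow as tf

# Hypothetical 4-class, int8-quantized model exported from the notebook.
interpreter = tf.lite.Interpreter(model_path="model_4_class.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed one input of the right shape/dtype (all zeros, just to exercise the model).
interpreter.set_tensor(input_details["index"],
                       np.zeros(input_details["shape"], dtype=input_details["dtype"]))
interpreter.invoke()

# Four quantized scores, one per label.
y_quantized = interpreter.get_tensor(output_details["index"])[0]  # shape (4,)
scale, zero_point = output_details["quantization"]

# Same formula as the two C++ lines above, applied element-wise.
y = (y_quantized.astype(np.float32) - zero_point) * scale
print(y)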

Inference not running with MAX9814

Hi @sandeepmistry,
Great work!
I am facing an issue.
I ran the Colab notebook; it compiled and produced the inference firmware, etc. However, when I upload it to the Pico, it does not detect anything and shows "not detected", even though I am only playing the test audio.
For clarity, I am using a MAX9814 analog microphone with a Raspberry Pi Pico on pin 26. I tried it a few months back and it was working fine then, but now it does not detect anything. Can you please help with this?
Thanks.
(screenshot attached: serial_output)

Fantastic Project!

Hi Sandeep,

I'm really impressed by this project and the article you wrote about it. It's exactly what I've been looking for. I hope to leverage it to update my cat doorbell. I do have a couple of questions, though.

  1. In the "Load wav file data" section, there is a dog-barking example loaded. Is that necessary for the project? I'm not sure why it's there, tbh. :)
  2. In the "Download datasets" section, I'm not sure what this construct for GitHub means:
    'https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/archive/refs/heads/fire_alarms.tar.gz',
    Did you add the tar file to GitHub, delete it, and then refer to it via the archive? I've never seen this construct before. Could you explain how that was done?

Thank you, sir.
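For context on question 2: GitHub serves archive/refs/heads/<branch>.tar.gz as an on-the-fly tarball of that branch, so no tar file was ever committed or deleted. A minimal sketch of fetching and extracting such a branch archive with tf.keras.utils.get_file (which caches under ~/.keras by default):

import tensorflow as tf

# Downloads the on-the-fly tarball of the fire_alarms branch and extracts it;
# the returned path points at the download/extraction location under ~/.keras.
path = tf.keras.utils.get_file(
    "fire_alarms.tar.gz",
    "https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/archive/refs/heads/fire_alarms.tar.gz",
    untar=True,
)
print(path)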

How to use picotool with this demo

Hi,

I want to compile this example locally. I've finished compiling the model and want to flash it onto the Pico board.

from colab_utils.pico import flash_pico

flash_pico('microphone-library-for-pico/build/examples/usb_microphone/usb_microphone.bin') 

I found that the flash_pico function uses the google.colab service, and I want to split that dependency out. I found picotool.

I want to compile picotool and use it to flash locally.

!git clone https://github.com/raspberrypi/picotool.git && cd picotool && mkdir build && cd build && cmake .. && make

output

Cloning into 'picotool'...
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (87/87), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 169 (delta 77), reused 60 (delta 60), pack-reused 82
Receiving objects: 100% (169/169), 113.09 KiB | 254.00 KiB/s, done.
Resolving deltas: 100% (99/99), done.
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Using PICO_SDK_PATH from the environment ('/home/ycwang/pico-sdk')
CMake Error at CMakeLists.txt:23 (message):
  Raspberry Pi Pico SDK version 1.3.0 (or later) required.  Your version is
  1.2.0


-- Configuring incomplete, errors occurred!
See also "/home/ycwang/picotool/build/CMakeFiles/CMakeOutput.log".

It seems my pico-sdk version is lower than what picotool requires. How can I solve this problem?

Can you tell me the steps to use picotool to flash the bin and the app locally?

Thanks.
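The quoted CMake error is the immediate blocker: the local pico-sdk checkout is 1.2.0, while picotool requires 1.3.0 or later, so pico-sdk needs updating before picotool will configure. Once picotool is built and on PATH, a local replacement for flash_pico might look like the sketch below (an assumption, not the notebook's code); it expects the board in BOOTSEL mode, and the firmware path is illustrative, using the .uf2 the SDK build typically produces alongside the .bin.

import subprocess

# Hypothetical local stand-in for flash_pico(): assumes the Pico is in BOOTSEL
# mode (hold BOOTSEL while plugging in USB) and a built picotool is on PATH.
firmware = "microphone-library-for-pico/build/examples/usb_microphone/usb_microphone.uf2"

subprocess.run(["picotool", "load", firmware], check=True)   # write firmware to flash
subprocess.run(["picotool", "reboot"], check=True)           # reboot into the application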

Issue in loading custom dataset

When I run this cell (loading the custom dataset):

custom_fire_alarm_ds = tf.data.Dataset.list_files("datasets/custom/fire_alarm/*.wav", shuffle=False)
custom_fire_alarm_ds = custom_fire_alarm_ds.map(lambda x: (x, 1, -1))
custom_fire_alarm_ds = custom_fire_alarm_ds.map(load_wav_for_map)
custom_fire_alarm_ds = custom_fire_alarm_ds.flat_map(split_wav_for_flat_map)
custom_fire_alarm_ds = custom_fire_alarm_ds.map(create_arm_spectrogram_for_map)

custom_background_noise_ds = tf.data.Dataset.list_files("datasets/custom/background_noise/*.wav", shuffle=False)
custom_background_noise_ds = custom_background_noise_ds.map(lambda x: (x, 0, -1))
custom_background_noise_ds = custom_background_noise_ds.map(load_wav_for_map)
custom_background_noise_ds = custom_background_noise_ds.flat_map(split_wav_for_flat_map)
custom_background_noise_ds = custom_background_noise_ds.map(create_arm_spectrogram_for_map)

custom_ds = tf.data.Dataset.concatenate(custom_fire_alarm_ds, custom_background_noise_ds)
custom_ds = custom_ds.map(lambda x, y, z: (tf.expand_dims(x, axis=-1), y, z))
custom_ds_len = calculate_ds_len(custom_ds)

print(f'{custom_ds_len}')

custom_ds = custom_ds.map(lambda x, y,z: (x, y))

custom_ds = custom_ds.shuffle(custom_ds_len).cache()

then I get the following error:

---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
in ()
13 custom_ds = tf.data.Dataset.concatenate(custom_fire_alarm_ds, custom_background_noise_ds)
14 custom_ds = custom_ds.map(lambda x, y, z: (tf.expand_dims(x, axis=-1), y, z))
---> 15 custom_ds_len = calculate_ds_len(custom_ds)
16
17 print(f'{custom_ds_len}')

4 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7105 def raise_from_not_ok_status(e, name):
7106 e.message += (" name: " + name if name is not None else "")
-> 7107 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7108
7109

InvalidArgumentError: Requires start <= limit when delta > 0: 0/-1
[[{{function_node __inference_split_wav_211}}{{node range}}]] [Op:IteratorGetNext]

I also recorded some audio using the built-in microphone on my Mac, and it is saved successfully in the directory.
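One way to narrow this down, since the traceback points at a range op inside split_wav, is to run the first custom file through the earlier stages eagerly and inspect what split_wav_for_flat_map receives. A debugging sketch, assuming the notebook's helpers are in scope:

import tensorflow as tf

# Inspect the output of load_wav_for_map for the first custom fire-alarm file;
# the shapes/dtypes printed here are what split_wav_for_flat_map will receive.
probe_ds = tf.data.Dataset.list_files("datasets/custom/fire_alarm/*.wav", shuffle=False)
probe_ds = probe_ds.map(lambda x: (x, 1, -1))
probe_ds = probe_ds.map(load_wav_for_map)

for item in probe_ds.take(1):
    print(tf.nest.map_structure(lambda t: (t.shape, t.dtype), item))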

How to train for well-known sounds?

Hello again @sandeepmistry,

I'm trying to create a "cat doorbell" using your excellent work as a basis. I already have a working version on a full-sized Raspberry Pi 4. Now I want to shrink it even more and have it on a Pico W.

Since cat sounds are well understood and recorded everywhere, can't I just train the model using pre-recorded sounds? Is there really any need for me to record anything myself?

If my assumption is correct, where would I get the cat sound samples? Is there anything publicly available?

Thank you.
-T

MAX9814 microphone module support

Hi there!
Great work! I would like to know: if I use a MAX9814 microphone module with the Pico instead of the MEMS microphone, what changes need to be made in the code?

Failed to initialize DSP Pipeline!

Hi!
I have managed to change your project to use a custom analogue microphone. It works fine with various sample rates from 16,000 up to 100,000 samples per second, which is what I need.
However, when I change FFT_SIZE from 256 to something greater, such as 512 or 1024, the program stops at dsp_pipeline.init().

if (arm_rfft_init_q15(&_S_q15, _fft_size, 0, 1) != ARM_MATH_SUCCESS) {
    return 0;
}

However, if I use the Arduino sketch provided in your article https://medium.com/towards-data-science/fixed-point-dsp-for-data-scientists-d773a4271f7f#2960, it works well with 100,000 samples per second at a window size of 1024. Weird.

Could you please help me with any ideas?
