
ml-audio-classifier-example-for-pico's People

Contributors

ep1cman, sandeepmistry

ml-audio-classifier-example-for-pico's Issues

Can I use this approach to detect a certain set of spoken words?

Greetings!

I am trying to make an IoT device setup with a Raspberry Pi Pico and an Adafruit PDM MEMS Microphone Breakout. My goal is to detect certain keywords in speech in real time. I have the data from the Speech Commands Dataset. Can I use this dataset and follow this notebook to make it detect keywords? If so, please give me some basic hints about where to change the code.

Thanks in advance 🙂
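For anyone attempting this, here is a minimal sketch of how keyword data could be fed into the same pipeline. It assumes a hypothetical datasets/speech_commands/<keyword>/ layout after extracting the Speech Commands archive, and that the notebook's map helpers (load_wav_for_map, split_wav_for_flat_map, create_arm_spectrogram_for_map) are in scope; it is a sketch, not the notebook's code.

import tensorflow as tf

# Hypothetical layout: datasets/speech_commands/<keyword>/*.wav after extracting
# the Speech Commands archive; each keyword folder becomes one integer label.
keywords = ["yes", "no", "stop", "go"]  # example keywords, not from the notebook

keyword_ds = None
for label, word in enumerate(keywords):
    ds = tf.data.Dataset.list_files(
        f"datasets/speech_commands/{word}/*.wav", shuffle=False)
    # Mirror the notebook's (path, label, split) triple; -1 follows the same
    # convention used for the custom fire-alarm data later on this page.
    ds = ds.map(lambda path, label=label: (path, label, -1))
    keyword_ds = ds if keyword_ds is None else keyword_ds.concatenate(ds)

# From here the notebook's existing maps (load_wav_for_map,
# split_wav_for_flat_map, create_arm_spectrogram_for_map) would be applied
# unchanged, and the model's final Dense layer would need len(keywords)
# outputs with a softmax instead of the single fire-alarm output.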

What to modify for multi-label classification?

Hi, I just want to reach out to ask what to modify in the MLModel::predict() function if I plan to do some multi-label classification.
In particular, I think I need to change these two lines of code in ml_model.cpp.

float y_quantized = _output_tensor->data.int8[0];
float y = (y_quantized - _output_tensor->params.zero_point) * _output_tensor->params.scale;

The model I intend to use will output 4 numbers for the label probabilities (4-label classification).
I wonder how I could modify _output_tensor->data to get the 4 numbers.
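A hedged sketch of the same dequantization applied to a 4-element output, shown with the Python tf.lite.Interpreter rather than the on-device code; in ml_model.cpp the analogous change would be to loop over the four int8 entries of the output tensor and apply the quoted formula to each index, not just index 0. The model path is hypothetical.

import numpy as np
import tensorflow as tf

# Hypothetical 4-class, int8-quantized model exported from the notebook.
interpreter = tf.lite.Interpreter(model_path="model_4_class.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed one input of the right shape/dtype (all zeros, just to exercise the model).
interpreter.set_tensor(input_details["index"],
                       np.zeros(input_details["shape"], dtype=input_details["dtype"]))
interpreter.invoke()

# Four quantized scores, one per label.
y_quantized = interpreter.get_tensor(output_details["index"])[0]  # shape (4,)
scale, zero_point = output_details["quantization"]

# Same formula as the two C++ lines above, applied element-wise.
y = (y_quantized.astype(np.float32) - zero_point) * scale
print(y)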

Inference not running with MAX9814

Hi @sandeepmistry,
Great work!
I am facing an issue.
I ran the Colab notebook; it compiled and produced the inference firmware, etc. However, when I upload it to the Pico, it does not detect anything and shows "not detected", even though I am only playing the test audio.
For clarity, I am using a MAX9814 analog microphone with a Raspberry Pi Pico on pin 26. I tried it a few months back and it was working fine then, but now it does not detect anything. Can you please help with this?
Thanks.
(screenshot attached: serial_output)

Fantastic Project!

Hi Sandeep,

I'm really impressed by this project and the article you wrote about it. It's exactly what I've been looking for. I hope to leverage it to update my cat doorbell. I do have a couple of questions, though.

  1. In the "Load wav file data" section, there is a dog-barking example loaded. Is that necessary for the project? I'm not sure why it's there, tbh. :)
  2. In the "Download datasets" section, I'm not sure what this construct for GitHub means:
    'https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/archive/refs/heads/fire_alarms.tar.gz',
    Did you add the tar file to GitHub, delete it, and then refer to it via the archive? I've never seen this construct before. Could you explain how that was done?

Thank you, sir.
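For context on question 2: GitHub serves archive/refs/heads/<branch>.tar.gz as an on-the-fly tarball of that branch, so no tar file was ever committed or deleted. A minimal sketch of fetching and extracting such a branch archive with tf.keras.utils.get_file (which caches under ~/.keras by default):

import tensorflow as tf

# Downloads the on-the-fly tarball of the fire_alarms branch and extracts it;
# the returned path points at the download/extraction location under ~/.keras.
path = tf.keras.utils.get_file(
    "fire_alarms.tar.gz",
    "https://github.com/ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico/archive/refs/heads/fire_alarms.tar.gz",
    untar=True,
)
print(path)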

How to use picotool with this demo

Hi,

I want to compile this example locally. I've finished compiling the model and want to flash it onto the Pico board.

from colab_utils.pico import flash_pico

flash_pico('microphone-library-for-pico/build/examples/usb_microphone/usb_microphone.bin') 

I found that the flash_pico function uses the google.colab service, and I want to split that dependency out. I found picotool.

I want to compile picotool and use it to flash locally.

!git clone https://github.com/raspberrypi/picotool.git && cd picotool && mkdir build && cd build && cmake .. && make

output

Cloning into 'picotool'...
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (87/87), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 169 (delta 77), reused 60 (delta 60), pack-reused 82
Receiving objects: 100% (169/169), 113.09 KiB | 254.00 KiB/s, done.
Resolving deltas: 100% (99/99), done.
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Using PICO_SDK_PATH from the environment ('/home/ycwang/pico-sdk')
CMake Error at CMakeLists.txt:23 (message):
  Raspberry Pi Pico SDK version 1.3.0 (or later) required.  Your version is
  1.2.0


-- Configuring incomplete, errors occurred!
See also "/home/ycwang/picotool/build/CMakeFiles/CMakeOutput.log".

It seems my pico-sdk version is lower than what picotool requires. How can I solve this problem?

Can you tell me the steps to use picotool to flash the bin and the app locally?

Thanks.
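The quoted CMake error is the immediate blocker: the local pico-sdk checkout is 1.2.0, while picotool requires 1.3.0 or later, so pico-sdk needs updating before picotool will configure. Once picotool is built and on PATH, a local replacement for flash_pico might look like the sketch below (an assumption, not the notebook's code); it expects the board in BOOTSEL mode, and the firmware path is illustrative, using the .uf2 the SDK build typically produces alongside the .bin.

import subprocess

# Hypothetical local stand-in for flash_pico(): assumes the Pico is in BOOTSEL
# mode (hold BOOTSEL while plugging in USB) and a built picotool is on PATH.
firmware = "microphone-library-for-pico/build/examples/usb_microphone/usb_microphone.uf2"

subprocess.run(["picotool", "load", firmware], check=True)   # write firmware to flash
subprocess.run(["picotool", "reboot"], check=True)           # reboot into the application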

Issue in loading custom dataset

When I run this cell (loading the custom dataset):

custom_fire_alarm_ds = tf.data.Dataset.list_files("datasets/custom/fire_alarm/*.wav", shuffle=False)
custom_fire_alarm_ds = custom_fire_alarm_ds.map(lambda x: (x, 1, -1))
custom_fire_alarm_ds = custom_fire_alarm_ds.map(load_wav_for_map)
custom_fire_alarm_ds = custom_fire_alarm_ds.flat_map(split_wav_for_flat_map)
custom_fire_alarm_ds = custom_fire_alarm_ds.map(create_arm_spectrogram_for_map)

custom_background_noise_ds = tf.data.Dataset.list_files("datasets/custom/background_noise/*.wav", shuffle=False)
custom_background_noise_ds = custom_background_noise_ds.map(lambda x: (x, 0, -1))
custom_background_noise_ds = custom_background_noise_ds.map(load_wav_for_map)
custom_background_noise_ds = custom_background_noise_ds.flat_map(split_wav_for_flat_map)
custom_background_noise_ds = custom_background_noise_ds.map(create_arm_spectrogram_for_map)

custom_ds = tf.data.Dataset.concatenate(custom_fire_alarm_ds, custom_background_noise_ds)
custom_ds = custom_ds.map(lambda x, y, z: (tf.expand_dims(x, axis=-1), y, z))
custom_ds_len = calculate_ds_len(custom_ds)

print(f'{custom_ds_len}')

custom_ds = custom_ds.map(lambda x, y,z: (x, y))

custom_ds = custom_ds.shuffle(custom_ds_len).cache()

then I get the following error:

---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
in ()
13 custom_ds = tf.data.Dataset.concatenate(custom_fire_alarm_ds, custom_background_noise_ds)
14 custom_ds = custom_ds.map(lambda x, y, z: (tf.expand_dims(x, axis=-1), y, z))
---> 15 custom_ds_len = calculate_ds_len(custom_ds)
16
17 print(f'{custom_ds_len}')

4 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7105 def raise_from_not_ok_status(e, name):
7106 e.message += (" name: " + name if name is not None else "")
-> 7107 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7108
7109

InvalidArgumentError: Requires start <= limit when delta > 0: 0/-1
[[{{function_node __inference_split_wav_211}}{{node range}}]] [Op:IteratorGetNext]

I also recorded some audio using the built-in microphone on my Mac, and it is saved successfully in the directory.
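One way to narrow this down, since the traceback points at a range op inside split_wav, is to run the first custom file through the earlier stages eagerly and inspect what split_wav_for_flat_map receives. A debugging sketch, assuming the notebook's helpers are in scope:

import tensorflow as tf

# Inspect the output of load_wav_for_map for the first custom fire-alarm file;
# the shapes/dtypes printed here are what split_wav_for_flat_map will receive.
probe_ds = tf.data.Dataset.list_files("datasets/custom/fire_alarm/*.wav", shuffle=False)
probe_ds = probe_ds.map(lambda x: (x, 1, -1))
probe_ds = probe_ds.map(load_wav_for_map)

for item in probe_ds.take(1):
    print(tf.nest.map_structure(lambda t: (t.shape, t.dtype), item))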

How to train for well-known sounds?

Hello again @sandeepmistry,

I'm trying to create a "cat doorbell" using your excellent work as a basis. I already have a working version on a full-sized Raspberry Pi 4. Now I want to shrink it even more and have it on a Pico W.

Since cat sounds are well understood and recorded everywhere, can't I just train the model using pre-recorded sounds? Is there really any need for me to record anything myself?

If my assumption is correct, where would I get the cat sound samples? Is there anything publicly available?

Thank you.
-T

MAX9814 microphone module support

Hi there!
Great work! I would like to know: if I use a MAX9814 microphone module with the Pico instead of the MEMS microphone, what changes need to be made in the code?

Failed to initialize DSP Pipeline!

Hi!
I have managed to change your project to use a custom analogue microphone. It works fine with various sample rates from 16,000 up to 100,000 samples per second, which is what I need.
However, when I change FFT_SIZE from 256 to something greater, such as 512 or 1024, the program stops at dsp_pipeline.init().

if (arm_rfft_init_q15(&_S_q15, _fft_size, 0, 1) != ARM_MATH_SUCCESS) {
    return 0;
}

However, if I use the Arduino sketch provided in your article https://medium.com/towards-data-science/fixed-point-dsp-for-data-scientists-d773a4271f7f#2960, it works well with 100,000 samples per second at a window size of 1024. Weird.

Could you please help me with any ideas?
