acoustic-representation-toolbox's People

Contributors

aza, pcbermant

acoustic-representation-toolbox's Issues

Add overview of the Acoustic Representation Toolbox to the README

I'd like to be able to link people to the https://github.com/earthspecies/representation-toolbox repo and have the README walk visitors through what the Acoustic Representation Toolbox is and why it matters, ideally with some examples, plus a description and an exemplar for each type of representation (a sketch of one such exemplar follows below). The audience is broader than just machine learning practitioners; assume readers are smart and have some technical grasp. The overview should stay broad, while the per-type examples can be a little more technical.

https://github.com/earthspecies/representation-toolbox
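
As a starting point for such an exemplar, here is a minimal sketch of how two common representations could be computed from a raw waveform. It assumes librosa as the DSP library purely for illustration; the toolbox's own API may differ.

```python
import librosa
import numpy as np

# Load a clip (librosa ships a short example recording; any mono file works).
y, sr = librosa.load(librosa.ex("trumpet"), sr=None)

# Linear-frequency STFT spectrogram: complex STFT -> magnitude in dB.
stft = librosa.stft(y, n_fft=1024, hop_length=256)
spec_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)

# Mel spectrogram: perceptually spaced frequency bins, a common ML input.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=256, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

print(spec_db.shape, mel_db.shape)  # (freq_bins, frames) for each
```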

Testing the performance of the models on standard datasets like MusDB18 and/or WSJ0-2mix

Given the performance of our model on the macaque task, I'm very curious to see how it would do in baseline tests on datasets commonly used in the literature. This is relevant because the lightweight nature of the UNet model could make it an attractive alternative to existing models for animal recordings captured at high sampling rates. With this in mind, we already have MusDB18 at our disposal, so this could be a good project while we decide which animal dataset to focus on. A sketch of such an evaluation loop follows.
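
A minimal sketch of a MusDB18 baseline test, assuming the `musdb` package for data loading and a hypothetical `separate(mixture, rate)` function wrapping our model (not an existing API in this repo):

```python
import musdb
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (1-D signals)."""
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10 * np.log10((target @ target) / (noise @ noise + eps))

mus = musdb.DB(download=True, subsets="test")  # 7-second preview tracks
scores = []
for track in mus:
    mixture = track.audio.mean(axis=1)                # stereo -> mono
    reference = track.targets["vocals"].audio.mean(axis=1)
    estimate = separate(mixture, track.rate)          # hypothetical model call
    scores.append(si_sdr(estimate, reference))
print(f"mean SI-SDR: {np.mean(scores):.2f} dB")
```

For results comparable to the literature, `museval` would give the full BSS-eval metrics; SI-SDR is used here only to keep the sketch self-contained.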

Improving the representation inverter by exploring more advanced models and/or reaching out to some of the authors

It seems very strange to me that the methods work very well for conventional STFT spectrograms but not so well for other representations. One reason representations might matter for bioacoustic applications is the high sampling rates involved. For instance, many dolphin recordings are sampled at 96 kHz. While it might be possible to downsample somewhat and still avoid aliasing, this would still leave an audio input with ~300k elements. However, we could also try slicing audio inputs into more manageable frame lengths and then concatenating the results during inference (see the sketch below). This, I suppose, is related to the "Variable Time Scales in Vocal Behavior" problem to some degree.
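
A minimal sketch of that slice-and-concatenate idea, assuming a hypothetical `model` that maps a fixed-length 1-D waveform to a same-length output; overlapping frames with a Hann-window crossfade avoid audible seams at the joins:

```python
import torch

def chunked_inference(model, audio, frame_len=32768, hop=16384):
    """Run `model` on overlapping frames and overlap-add the results."""
    n = audio.shape[-1]
    out = torch.zeros(n)
    weight = torch.zeros(n)
    window = torch.hann_window(frame_len)        # crossfade weighting
    for start in range(0, n, hop):
        end = min(start + frame_len, n)
        frame = audio[start:end]
        if frame.shape[-1] < frame_len:          # zero-pad the last frame
            frame = torch.nn.functional.pad(
                frame, (0, frame_len - frame.shape[-1]))
        with torch.no_grad():
            y = model(frame.unsqueeze(0)).squeeze(0)  # assumed (batch, samples) I/O
        out[start:end] += (y * window)[: end - start]
        weight[start:end] += window[: end - start]
    # Normalize by the accumulated window energy; edge samples stay tapered.
    return out / weight.clamp_min(1e-8)
```

With the hop set to half the frame length, the periodic Hann windows sum to a constant, so the normalization only matters at the clip edges.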
