
magic-mic's People

Contributors

gabcoh, learnedvector, matthewscholefield

magic-mic's Issues

Figure out how to handle rewinds in the module

The module is a pretty smooth loopback right now, but whenever pulseaudio logs a rewind there is a pop. This is probably because I haven't handled rewinds (I don't really even understand what they are right now). Rewinds need to be handled correctly.

Improve RPC

Right now the RPC interface is pretty confusing, brittle, and poorly documented. I want to improve it. Right now I'm sort of using JSON-RPC but not really. I don't support concurrent/asynchronous requests and it's very prone to race conditions. The methods are poorly documented. I'd like to fix all of this in as easy a way as possible, which would probably mean sticking mostly with the current implementation, but I'm definitely on the lookout for a tiny RPC framework that would make this easier. That said, I do think that fixing these issues probably won't be too hard with the current implementation.

Pulseaudio can't load module when linked with denoiser

Right now the module cannot load into pulseaudio when it is linked against the denoiser. My guess is that it is some problem to do with dynamic linking, so I tried to build pytorch as a static library, which I have not succeeded at yet. It also could be due to C++ name mangling not being handled properly somewhere.

I will try to update this issue with screenshots of the message when I get a chance.

Improve denoiser api

Right now it's pretty naive. I don't think the feed/spew is the best interface: there are too many buffers to keep track of. It would be better if it just processed one chunk of audio at a time and kept track of whatever additional context it needs. Should think about this before #36.

Update to Tauri beta

We are using the alpha, but Tauri is now in beta. Apparently it's a pretty awesome change, so we should get on that.

E2E server tests

Try setting up some e2e tests on the server on various distros. Not exactly sure how to go about this, but it's worth looking into.

Look into module version checks

My default archlinux pulseaudio daemon has no problem loading a module with any PA_MODULE_VERSION set but my pulseaudio built from source refuses to load the module if the version is not MODULE_VERSION. I need to look into what is causing this (is it just a new feature?) and if it is a problem on actual platforms. If the version is checked on other platforms we can probably live patch the module based on pulseaudio --version but ideally that won't be necessary.

Implement GUI MVP

Starting out with electron-webpack, implementing Michael's figma design

Disable logging in production for now

For now we should just completely disable logging in prod. We can try to figure out a good way to maybe do some rotating log files in the future, but for now just have it off by default.
We should make sure you can re-enable them, maybe with a MAGICMIC_LOG variable.
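
A minimal sketch of what such an opt-in check could look like. MAGICMIC_LOG is the variable name suggested above; the helper names (logging_enabled_from, logging_enabled) and the convention that unset, empty, or "0" means "off" are assumptions made for illustration, not something the project defines.

```cpp
#include <cstdlib>
#include <string>

// Decide from the raw variable value whether logging should be on.
// Assumption: unset, empty, or "0" means off; anything else means on.
bool logging_enabled_from(const char* value) {
    if (value == nullptr) {
        return false;  // variable not set: keep logging disabled
    }
    const std::string v(value);
    return !v.empty() && v != "0";
}

// Read MAGICMIC_LOG from the environment and apply the rule above.
bool logging_enabled() {
    return logging_enabled_from(std::getenv("MAGICMIC_LOG"));
}
```

Keeping the decision in a function that takes the raw value makes it easy to test without touching the real environment.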

Real Bidirectional Communication

Right now we're following JSON-RPC pretty closely for our ui -> server interactions, but I'm pretty sure JSON-RPC does not allow the server to send notifications to the client. This would be useful for us because sometimes the server makes changes that need to be reflected in the ui. Right now that is implemented using polling, but it might be better to actually just have the server send updates when it needs to. The basic infrastructure is already in the server (VirtualMicUpdate) but the ui doesn't support it. To get the ui to support it we would need to have some sort of Rust control over an event emitter in the JavaScript. I don't think the Tauri alpha has anything like this but the Tauri beta might, so this might be blocked on #53.

Redesign Audio Processor api

Right now it's pretty naive. I don't think the feed/spew is the best interface: there are too many buffers to keep track of. It would be better if it just processed one chunk of audio at a time and kept track of whatever additional context it needs. Should think about this before #36.
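
One possible shape for a chunk-at-a-time interface, as a rough sketch only. The names AudioProcessor, chunk_size, process, and SmoothingProcessor are hypothetical and not taken from the codebase, and the toy low-pass filter merely stands in for a real denoiser to show context being kept inside the processor instead of in caller-managed buffers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical chunk-at-a-time interface: the caller hands over exactly one
// chunk per call and the processor keeps any extra context internally.
class AudioProcessor {
public:
    virtual ~AudioProcessor() = default;

    // Number of samples process() expects on every call.
    virtual std::size_t chunk_size() const = 0;

    // Process one chunk in place. n must equal chunk_size().
    virtual void process(float* samples, std::size_t n) = 0;
};

// Toy stand-in for a denoiser: a one-pole low-pass filter. Its only context
// is the previous output sample, which it carries across chunks itself.
class SmoothingProcessor : public AudioProcessor {
public:
    SmoothingProcessor(std::size_t chunk, float alpha)
        : chunk_(chunk), alpha_(alpha) {}

    std::size_t chunk_size() const override { return chunk_; }

    void process(float* samples, std::size_t n) override {
        assert(n == chunk_);
        for (std::size_t i = 0; i < n; ++i) {
            prev_ += alpha_ * (samples[i] - prev_);
            samples[i] = prev_;
        }
    }

private:
    std::size_t chunk_;
    float alpha_;
    float prev_ = 0.0f;  // context kept between calls
};
```

With this shape the caller only ever owns one buffer per call; anything like lookback frames or model state would live in private members of the concrete processor.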

Implement RunningSTD

Check out something like Welford's online algorithm: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm

Right now I'm just returning 0.2, which was the std of a random test file.
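
For reference, a minimal sketch of Welford's online algorithm. The class name RunningSTD comes from the issue title; the method names (push, mean, stddev) and the choice to report the population standard deviation (dividing by n rather than n - 1) are assumptions for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Running mean and standard deviation via Welford's online algorithm.
// Samples are folded in one at a time, so no history needs to be stored.
class RunningSTD {
public:
    // Fold one new sample into the running statistics.
    void push(double x) {
        ++count_;
        const double delta = x - mean_;            // distance to the old mean
        mean_ += delta / static_cast<double>(count_);
        m2_ += delta * (x - mean_);                // uses the updated mean
    }

    std::size_t count() const { return count_; }
    double mean() const { return mean_; }

    // Population standard deviation; returns 0 before any sample is seen.
    double stddev() const {
        if (count_ == 0) {
            return 0.0;
        }
        return std::sqrt(m2_ / static_cast<double>(count_));
    }

private:
    std::size_t count_ = 0;
    double mean_ = 0.0;
    double m2_ = 0.0;  // running sum of squared deviations from the mean
};
```

Feeding it the samples 2, 4, 4, 4, 5, 5, 7, 9 gives a mean of 5 and a population standard deviation of 2.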

Latency problems but awesome

First of all, this is a terrific project! Very useful and easy to use.

On my first try everything went OK (probably because the CPU consumption was low), but when I started a meeting using brave-browser the noise filter was disabled due to high latency. I've switched to the lightweight filter, but it is not as good as yours.

Amazing work, thank you for making it open source!

Clean up UI a little

There is a little too much space on the edges, it is not vertically centered, and personally I think the text seems a little big (even though it was big in the mockup).

Figure out distribution

I need to figure out which shared libraries to ship with and how to package them.

I think I can just put them in resources and set LD_LIBRARY_PATH appropriately.

Which shared libraries to ship is a different question. Maybe start with everything, see how big that is and slim down from there?

Integrate pipesource-mvp with GUI

This should be the last step before a real MVP. I'm thinking maybe use rpc (grpc), but I'm not sure yet. Need to do some more investigating on that front.
Whatever we do, it might be a good idea to have a non-Electron intermediary which implements the RPC (or whatever) API and then does the platform-specific calls itself. It's probably easiest not to deal with that stuff within Electron.

Is 16kb 16000 or 16384?

Find out what we're using. It might not actually matter, but I think it would be good to be on the same page everywhere (talking about denoiser and me).

Proof of concept based on builtin modules

Simplest route might be to use a null-sink or virtual-sink and its monitor as the mic. I think we need to use the pulseaudio api (rather than eg. portaudio) so we can choose our mic and sink.

Improve Latency

In pipesource-mvp there is a huge amount of latency. This is probably due in part to both denoiser and maybe something in pipesource app itself. Need to do more investigation to see if anything is coming from pipesource, but on the denoiser end here are some ideas to improve latency:

  • use prefilled audio buffer like in example python
  • always spew everything no matter what by zero padding

I'll add more info to this issue if I find more.

ConnectionRefused with RNNoise module

When running the RNNoise module, it consistently errors with:

[2021-04-29 10:17:18.535] [server] [info] Loading Audio Processor from /tmp/.mount_magic-2h02QS/usr/lib/magic-mic/native/runtime_libs/audioproc.so
[2021-04-29 10:17:18.536] [server] [error] Cannot load Audio Processor create symbol: /tmp/.mount_magic-2h02QS/usr/bin/server-x86_64-unknown-linux-gnu: undefined symbol: create
thread 'main' panicked at 'Failed to connect to socket; FIX THIS RACE CONDITON: Os { code: 111, kind: ConnectionRefused, message: "Connection refused" }', src/main.rs:118:10
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Despite this, the Audo-AI module consistently works.

Version: eae88f0

What if Queue falls behind realtime?

Right now there is no consideration given in the module to what to do if the queue falls behind realtime. We need to check for this and respond accordingly (i.e. drop samples, speed things up [by resampling?], or something else).
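
As an illustration of the "drop samples" option only, here is a sketch of a bounded queue that discards its oldest samples once it grows past a limit. The class name BoundedSampleQueue, its methods, and the drop-oldest policy are all assumptions; the module's real queue may look quite different.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Bounded FIFO of audio samples. When the producer outruns the consumer,
// the oldest samples are dropped so the queue never lags by more than
// max_samples behind realtime.
class BoundedSampleQueue {
public:
    explicit BoundedSampleQueue(std::size_t max_samples)
        : max_samples_(max_samples) {}

    // Append new samples, then discard the oldest ones if the queue has
    // grown past its limit. Returns how many samples were dropped.
    std::size_t push(const std::vector<float>& samples) {
        queue_.insert(queue_.end(), samples.begin(), samples.end());
        std::size_t dropped = 0;
        while (queue_.size() > max_samples_) {
            queue_.pop_front();
            ++dropped;
        }
        dropped_total_ += dropped;
        return dropped;
    }

    // Remove and return up to n samples, oldest first.
    std::vector<float> pop(std::size_t n) {
        const auto count =
            static_cast<std::ptrdiff_t>(std::min(n, queue_.size()));
        std::vector<float> out(queue_.begin(), queue_.begin() + count);
        queue_.erase(queue_.begin(), queue_.begin() + count);
        return out;
    }

    std::size_t size() const { return queue_.size(); }
    std::size_t dropped_total() const { return dropped_total_; }

private:
    std::size_t max_samples_;
    std::size_t dropped_total_ = 0;
    std::deque<float> queue_;
};
```

Dropping whole runs of samples will itself cause an audible glitch, so the dropped_total counter is there mainly so the caller can log or react when it happens.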

Set default input source correctly

Right now I think the source is set arbitrarily via pa_stream_connect_record(..., source, ...) where source is nullptr at startup (which pulseaudio defines as setting the source as the system pleases). In my system this sets it to my audio monitor (alsa_output.pci-0000_00_1f.3.analog-stereo.monitor). We should figure out how to set the default input to the default source on startup.

While super clunky (as with everything in pulseaudio xD), one option is to get this via pa_context_get_server_info which triggers a callback including the server info which includes the default source name:

void magic_mic_server_info_cb(pa_context *c, const pa_server_info *info, void *userdata) {
    (void) info->default_source_name;
}

Another option is to set source = "@DEFAULT_SOURCE@" on startup, but when the default source gets set to magic mic, then this would also change. So if we did this we would want to immediately read the current source and then reassign it to the actual name.

Alternatively, to ease this issue, we could simply hide monitors so in cases where users only have one non-monitor mic, the default is correct. For reference, the output of pactl list sources for me is:

Source #0
	State: RUNNING
	Name: alsa_output.pci-0000_00_1f.3.analog-stereo.monitor
	Description: Monitor of Built-in Audio Analog Stereo
	Driver: module-alsa-card.c
	Sample Specification: s16le 2ch 48000Hz
	Channel Map: front-left,front-right
	Owner Module: 6
	Mute: no
	Volume: front-left: 63250 /  97% / -0.93 dB,   front-right: 63250 /  97% / -0.93 dB
	        balance 0.00
	Base Volume: 65536 / 100% / 0.00 dB
	Monitor of Sink: alsa_output.pci-0000_00_1f.3.analog-stereo
	Latency: 0 usec, configured 25000 usec
	Flags: DECIBEL_VOLUME LATENCY 
	Properties:
		device.description = "Monitor of Built-in Audio Analog Stereo"
		device.class = "monitor"
		alsa.card = "0"
		alsa.card_name = "HDA Intel PCH"
		alsa.long_card_name = "HDA Intel PCH at 0x94520000 irq 132"
		alsa.driver_name = "snd_hda_intel"
		device.bus_path = "pci-0000:00:1f.3"
		sysfs.path = "/devices/pci0000:00/0000:00:1f.3/sound/card0"
		device.bus = "pci"
		device.vendor.id = "8086"
		device.vendor.name = "Intel Corporation"
		device.product.id = "a171"
		device.product.name = "CM238 HD Audio Controller"
		device.form_factor = "internal"
		device.string = "0"
		module-udev-detect.discovered = "1"
		device.icon_name = "audio-card-pci"
	Formats:
		pcm

Source #1
	State: RUNNING
	Name: alsa_input.pci-0000_00_1f.3.analog-stereo
	Description: Built-in Audio Analog Stereo
	Driver: module-alsa-card.c
	Sample Specification: s16le 2ch 48000Hz
	Channel Map: front-left,front-right
	Owner Module: 6
	Mute: no
	Volume: front-left: 14389 /  22% / -39.51 dB,   front-right: 14389 /  22% / -39.51 dB
	        balance 0.00
	Base Volume: 6554 /  10% / -60.00 dB
	Monitor of Sink: n/a
	Latency: 8235 usec, configured 40000 usec
	Flags: HARDWARE HW_MUTE_CTRL HW_VOLUME_CTRL DECIBEL_VOLUME LATENCY 
	Properties:
		alsa.resolution_bits = "16"
		device.api = "alsa"
		device.class = "sound"
		alsa.class = "generic"
		alsa.subclass = "generic-mix"
		alsa.name = "ALC255 Analog"
		alsa.id = "ALC255 Analog"
		alsa.subdevice = "0"
		alsa.subdevice_name = "subdevice #0"
		alsa.device = "0"
		alsa.card = "0"
		alsa.card_name = "HDA Intel PCH"
		alsa.long_card_name = "HDA Intel PCH at 0x94520000 irq 132"
		alsa.driver_name = "snd_hda_intel"
		device.bus_path = "pci-0000:00:1f.3"
		sysfs.path = "/devices/pci0000:00/0000:00:1f.3/sound/card0"
		device.bus = "pci"
		device.vendor.id = "8086"
		device.vendor.name = "Intel Corporation"
		device.product.id = "a171"
		device.product.name = "CM238 HD Audio Controller"
		device.form_factor = "internal"
		device.string = "front:0"
		device.buffering.buffer_size = "352800"
		device.buffering.fragment_size = "176400"
		device.access_mode = "mmap+timer"
		device.profile.name = "analog-stereo"
		device.profile.description = "Analog Stereo"
		device.description = "Built-in Audio Analog Stereo"
		module-udev-detect.discovered = "1"
		device.icon_name = "audio-card-pci"
	Ports:
		analog-input-internal-mic: Internal Microphone (type: Mic, priority: 8900, availability unknown)
		analog-input-mic: Microphone (type: Mic, priority: 8700, not available)
	Active Port: analog-input-internal-mic
	Formats:
		pcm
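
A rough sketch of the "hide monitors" idea, combined with preferring the server's default source when it is a real input. SourceInfo is a hypothetical stand-in holding just the name and the device.class property from the listing above, and pick_input_source is an invented helper name. In actual PulseAudio code the same check could probably be made on the monitor_of_sink field of pa_source_info instead of the property string.

```cpp
#include <string>
#include <vector>

// Minimal stand-in for the fields of interest from `pactl list sources`.
// Hypothetical type, not PulseAudio's pa_source_info.
struct SourceInfo {
    std::string name;
    std::string device_class;  // value of the device.class property
};

bool is_monitor(const SourceInfo& s) {
    return s.device_class == "monitor";
}

// Choose the startup input source:
//  1. keep the server's default source if it exists and is not a monitor;
//  2. otherwise fall back to the first non-monitor source;
//  3. return an empty string if there is no usable source at all.
std::string pick_input_source(const std::vector<SourceInfo>& sources,
                              const std::string& server_default) {
    for (const SourceInfo& s : sources) {
        if (s.name == server_default && !is_monitor(s)) {
            return s.name;
        }
    }
    for (const SourceInfo& s : sources) {
        if (!is_monitor(s)) {
            return s.name;
        }
    }
    return "";
}
```

With the two sources listed above, this picks alsa_input.pci-0000_00_1f.3.analog-stereo even if the server default points at the monitor.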

Deal with pipes blocking in pipesource-mvp

The module-pipesource-pipe fills up and we block on it. This is bad for many reasons, including its implications for latency, but right now the main problem is that it interferes with signal handling. It seems like C++ I/O doesn't give us enough control over this, so we need to use lower-level I/O. Shouldn't be a problem, I just need to do it and I'm too tired right now.
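
A sketch of what the lower-level approach might look like with POSIX calls, assuming the pipe is written through a raw file descriptor. The helper names set_nonblocking and write_some are made up for this example; the fcntl and write calls and the O_NONBLOCK, EAGAIN, and EINTR semantics are standard POSIX.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Put a file descriptor into non-blocking mode. Returns false on failure.
bool set_nonblocking(int fd) {
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return false;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Write as much of the buffer as the pipe accepts right now.
// Returns the number of bytes written (0 if the pipe is full), or -1 on a
// real error. It never waits, so the caller keeps control and can check a
// shutdown flag set by a signal handler between calls.
ssize_t write_some(int fd, const char* data, std::size_t len) {
    std::size_t written = 0;
    while (written < len) {
        const ssize_t n = write(fd, data + written, len - written);
        if (n > 0) {
            written += static_cast<std::size_t>(n);
        } else if (n == -1 &&
                   (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)) {
            break;      // pipe full or interrupted: hand control back
        } else {
            return -1;  // real error (or an unexpected zero-length write)
        }
    }
    return static_cast<ssize_t>(written);
}
```

The caller then has to decide what to do with the unwritten remainder, for example keep it for the next iteration or drop it to bound latency.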

Close to system tray

Ideally, I think we would want it so that closing the app actually minimizes to tray and right click on tray has a quit option.

Detect High Latency

If load gets high, disable denoising and notify the user, maybe using something from here.

We already detect high latency in the audio processor queue (which shouldn't even exist in the first place, #48), and when load gets bad latency tends to accumulate in the recording stream queue, which can be detected by pa_stream_get_readable_size.
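
To make the check concrete, here is a sketch that turns a readable byte count (as pa_stream_get_readable_size would report) into milliseconds of queued audio and compares it to a threshold. SampleFormat, buffered_ms, latency_too_high, and the threshold value are hypothetical. The arithmetic assumes uncompressed interleaved PCM; PulseAudio's pa_bytes_to_usec could likely replace the manual conversion.

```cpp
#include <cstddef>

// Hypothetical description of the recording stream's PCM format.
struct SampleFormat {
    unsigned rate_hz;           // e.g. 48000
    unsigned channels;          // e.g. 2
    unsigned bytes_per_sample;  // e.g. 2 for s16le
};

// Convert a number of queued bytes into milliseconds of audio.
double buffered_ms(std::size_t readable_bytes, const SampleFormat& fmt) {
    const double bytes_per_second = static_cast<double>(fmt.rate_hz) *
                                    fmt.channels * fmt.bytes_per_sample;
    return 1000.0 * static_cast<double>(readable_bytes) / bytes_per_second;
}

// True when more audio is queued than the caller is willing to tolerate.
bool latency_too_high(std::size_t readable_bytes, const SampleFormat& fmt,
                      double threshold_ms) {
    return buffered_ms(readable_bytes, fmt) > threshold_ms;
}
```

For s16le, 2 channels, 48000 Hz, one second of audio is 192000 bytes, so 19200 readable bytes correspond to 100 ms of backlog.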

Fix Docker builds

Right now the dockerfile setup is pretty annoying for a few reasons:

  1. It relies on caching build stages for the build not to take forever. This works, but I'd rather not rely on it. Better if the caching were a bit more explicit and maybe a bit more extensive
  2. It only builds from git, so you can't really test things locally without pushing them. That is pretty horrible
