patrickduncan / cleansio



Real-time music censoring - Capstone

License: MIT License

Python 95.94% Shell 2.42% Dockerfile 1.64%

audio audio-analysis audio-library censorship censor

cleansio's People

levin-noro mross1080 awooooool

cleansio's Issues

Python 3.4 GCS issues on TravisCI

pip install --upgrade google-cloud-speech

throws error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/travis/virtualenv/python3.4.6/lib/python3.4/site-packages/six-1.10.0.dist-info/METADATA'

Behaviour Diagram

I would like a detailed diagram of the main actions as of c5d310f. It would help greatly when new teammates are added to our project.

Please use https://draw.io and make it open for extension.

Research - Tempo (BPM)

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how the song's beats per minute affect the accuracy of the lyrics. Based on this investigation we may need to either speed up or slow down snippets.
Provide examples

Create PERT chart

PERT Charts or other project management technique, Continuous, with rotating deadlines per group: 10%

Overlapping Chunk

To improve the accuracy of #70, use an overlapping audio chunk.

Right now we're sequentially breaking up the audio file into 5 second chunks. The problem with this is that the lyrics at the border between 2 audio chunks are cut off. By using another audio chunk that starts 2.5 seconds ahead we can avoid this issue.

Create another chunk 2.5 seconds after the current chunk
Transcribe with GCS
Use that transcription to improve the accuracy of the current chunk
Discard the overlapping chunk
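
A minimal sketch of how the chunk windows could be computed, assuming 5 second main chunks and a 2.5 second offset as described above. The function name `chunk_bounds` and its parameters are hypothetical and not taken from the Cleansio code; the transcription and merge steps are left out.

```python
def chunk_bounds(total_ms, chunk_ms=5000, offset_ms=2500):
    """Return (start_ms, end_ms, is_overlap) windows for a file of total_ms.

    Main chunks tile the file back to back. Each overlap chunk starts
    offset_ms into a main chunk so that it straddles the next border.
    """
    bounds = []
    for start in range(0, total_ms, chunk_ms):
        bounds.append((start, min(start + chunk_ms, total_ms), False))
        # Only add an overlap window when there is a border after this chunk.
        if start + chunk_ms < total_ms:
            overlap_start = start + offset_ms
            overlap_end = min(overlap_start + chunk_ms, total_ms)
            bounds.append((overlap_start, overlap_end, True))
    return bounds
```

The `is_overlap` flag marks the windows that should be discarded once their transcription has been used.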

Windows support

Provide documentation on how to set it up on Windows. Assume they're using Windows 7+ and Bash.

No Lyric Output

We shouldn't output lyrics at all, so please remove the code which does so.

Remove the print
Return the lyrics

Incorrect exit code in lint script

Script should exit with the same code as the pylint call.
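
One possible shape for this, sketched in Python rather than in the existing script's own language. It assumes pylint is installed for the current interpreter; the function names and the `cleansio` path in the comment are illustrative assumptions.

```python
import subprocess
import sys


def exit_code_of(command):
    """Run command (a list of arguments) and return its exit code."""
    return subprocess.call(command)


def lint(paths):
    """Run pylint on paths and return pylint's own exit code."""
    # Assumes pylint is installed for the current interpreter.
    return exit_code_of([sys.executable, '-m', 'pylint'] + list(paths))


# In the script's entry point (the 'cleansio' path is an assumption):
# sys.exit(lint(['cleansio']))
```

Passing the return value straight to `sys.exit` is what lets CI see a failing lint run as a failure.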

Conversion fails for .m4a

Traceback (most recent call last):
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/grpc/_channel.py", line 532, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/grpc/_channel.py", line 466, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "Must use 16 bit samples for LINEAR_PCM, but WAV header indicates 32 bit per sample."
	debug_error_string = "{"created":"@1545427960.216637000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1099,"grpc_message":"Must use 16 bit samples for LINEAR_PCM, but WAV header indicates 32 bit per sample.","grpc_status":3}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "cleansio/cleansio.py", line 46, in <module>
    transcribe(AudioFile(sys.argv[1]))
  File "/Users/patrick/dev/cleansio/cleansio/speech/transcribe.py", line 14, in transcribe
    audio_file.sample_rate)
  File "/Users/patrick/dev/cleansio/cleansio/speech/transcribe.py", line 32, in transcribe_each_slice
    response = client.recognize(config, audio)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/cloud/speech_v1/gapic/speech_client.py", line 227, in recognize
    request, retry=retry, timeout=timeout, metadata=metadata)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/gapic_v1/method.py", line 139, in __call__
    return wrapped_func(*args, **kwargs)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
    on_error=on_error,
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/retry.py", line 177, in retry_target
    return target()
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/timeout.py", line 206, in func_with_timeout
    return func(*args, **kwargs)
  File "/Users/patrick/anaconda3/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 61, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 Must use 16 bit samples for LINEAR_PCM, but WAV header indicates 32 bit per sample.
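
The error says the converted WAV has 32 bit samples while LINEAR_PCM needs 16 bit. A small sketch of a pre-flight check using only the standard library is below; the function names are hypothetical, and it assumes the temporary file is a plain PCM WAV that the `wave` module can parse.

```python
import io
import wave


def wav_bits_per_sample(wav_bytes):
    """Return the bit depth declared in the header of a PCM WAV file."""
    with wave.open(io.BytesIO(wav_bytes), 'rb') as wav:
        return wav.getsampwidth() * 8


def is_16_bit_wav(wav_bytes):
    """True when the WAV uses the 16 bit samples named in the error above."""
    return wav_bits_per_sample(wav_bytes) == 16
```

If the check fails, the conversion step would need to write 16 bit samples. pydub's `AudioSegment.set_sample_width(2)` looks like one candidate for that, though it has not been verified against this file.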

Maximize Chunk Loudness

Based on #56, we need to increase the volume of the audio file up to the threshold just before clipping.

According to pydub's docs, it exposes a relative loudness value and also supports modifying the loudness of a file.
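
To illustrate what "the threshold before clipping" means, here is a library-free sketch of peak normalization on 16 bit samples. It is not pydub code, and `normalize_peak` is a hypothetical name.

```python
INT16_MAX = 32767


def normalize_peak(samples, ceiling=INT16_MAX):
    """Scale 16 bit samples so the loudest one just reaches ceiling."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # Silence: nothing to amplify.
    gain = ceiling / peak
    # Clamp after rounding so no sample can exceed the 16 bit range.
    return [max(-ceiling - 1, min(ceiling, int(round(s * gain)))) for s in samples]
```

With pydub the same idea would probably be `segment.apply_gain(-segment.max_dBFS)`, which has not been tested here.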

Prepare for Censor Process

Break up the censoring process into an import
Create a file/real-time mode
General helper module

Create Clean File

Concatenate (censored) audio chunks into a new file
Use Progress or tqdm

Convert to Mono

Check how many channels it has
Convert to Mono
- Google Cloud Speech only accepts 1 channel
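
A rough sketch of the downmix itself, assuming the samples are interleaved integers (L, R, L, R, ...). The helper name is hypothetical; in practice pydub's `AudioSegment.set_channels(1)` would likely do this for us.

```python
def downmix_to_mono(interleaved, channels):
    """Average interleaved samples into a single channel."""
    if channels < 1:
        raise ValueError('channels must be at least 1')
    if channels == 1:
        return list(interleaved)  # Already mono.
    frames = len(interleaved) // channels
    return [
        sum(interleaved[i * channels:(i + 1) * channels]) // channels
        for i in range(frames)
    ]
```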

Unit Tests

Add to CI
Unit test all functions that do not involve GCS.

Only Use Single Quotes

https://github.com/edaniszewski/pylint-quotes

Mute Explicits

Now that we've located where the explicits are, we need to ensure the user doesn't hear those words. Based on the timestamps, mute sections of the audio chunk created by Cleansio and stored in ~/.cleansio-temp.

Mute sections based on timestamps
Overwrite the audio chunk
MUTE instead of BLEEP
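
A minimal sketch of muting by timestamp, assuming mono samples held in a list and timestamps in milliseconds relative to the start of the chunk. `mute_sections` is a hypothetical helper, not existing Cleansio code, and overwriting the chunk on disk is left out.

```python
def mute_sections(samples, sample_rate, sections_ms):
    """Return a copy of samples with each (start_ms, end_ms) section zeroed."""
    muted = list(samples)
    for start_ms, end_ms in sections_ms:
        # Convert milliseconds to sample indices and clamp to the chunk.
        start = max(0, start_ms * sample_rate // 1000)
        end = min(len(muted), end_ms * sample_rate // 1000)
        for i in range(start, end):
            muted[i] = 0
    return muted
```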

Create Dockerfile

Research - Vocal Pitch

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how the artist's pitch affects the accuracy of the lyrics
Provide examples

Friendly Output When File Doesn't Exist

Create Poster - Dec. 5

Due

GO/NoGO Poster, December 5th: 10%
Poster Tweets, December 31st: 10%

Responsibilities

Corie
- Background
- Censoring
Levin
- Process (writeup)
- Retrieving Lyrics
Patrick
- Design
- Process (diagram)
Victor
- Future Work: Real-time
- Acknowledgements

Accept Any Audio Format

~~Convert to FLAC on runtime~~
Create a temporary WAVE file if they input something other than FLAC/WAVE.

Delete the file after execution. Use system's temp.

Defaulting to WAVE because it yields more accurate GCS results.

Always cleanup files

If you exit the program early the temporary files will stay in the folder. Implement some type of finally block.
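
One way the finally block could look, as a sketch. The function names, the `cleansio-` prefix and the `chunk-0.wav` file name are assumptions made for illustration.

```python
import os
import shutil
import tempfile


def run_with_temp_dir(work):
    """Run work(temp_dir) and always delete temp_dir afterwards."""
    temp_dir = tempfile.mkdtemp(prefix='cleansio-')
    try:
        return work(temp_dir)
    finally:
        # Runs on normal return, on exceptions and on Ctrl+C.
        shutil.rmtree(temp_dir, ignore_errors=True)


def example_work(temp_dir):
    """Hypothetical stand-in for writing a converted audio chunk."""
    path = os.path.join(temp_dir, 'chunk-0.wav')
    with open(path, 'wb') as handle:
        handle.write(b'')
    return path
```

A finally block does not run if the process is killed outright, so stale directories may still need a cleanup on the next start.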

CONTRIBUTING file

Please create a file that details how a developer should go about contributing to our project. A step-by-step procedure.

Follow https://gist.github.com/PurpleBooth/b24679402957c63ec426, ours should be a fraction of the length!

Be sure to cover things like this: if you add a library to the project, document in the README how others can install it.

Word Timestamps

Now that we have a list of explicits, we need to locate where each word is sung in the 5 second audio chunk.

Locate the explicits as accurately as possible
Use milliseconds
Define a start and end timestamp (for each explicit)

Use Package Manager

Currently, we're installing each dependency separately. We should use a package manager that lists each library and the version to guarantee deterministic behaviour.

Use pipenv, poetry or requirements.txt
- Chose requirements.txt
Update README
Update TravisCI
Update Dockerfile

Research - Genre

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how the music genre affects the accuracy of the lyrics
Provide examples

No Length Cap

Transcribe an entire song by transcribing small ~10 second chunks

Research - Removing Instrumental

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how removing the instrumental affects the accuracy of the lyrics
Provide examples

Command-line Arguments

Use argparse

file_path is an argument
user list should be an argument
safe should be an argument
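
A possible starting point with argparse. The issue only names the three arguments, so the flag spellings (`--user-list`, `--safe`) and help texts below are assumptions.

```python
import argparse


def build_parser():
    """Build the command-line parser for the three arguments listed above."""
    parser = argparse.ArgumentParser(description='Real-time music censoring')
    parser.add_argument('file_path', help='path to the audio file')
    # Flag names are assumptions; the issue only names the arguments.
    parser.add_argument('--user-list', help='path to a user-provided list of explicits')
    parser.add_argument('--safe', action='store_true', help='enable safe mode')
    return parser
```

Usage: `args = build_parser().parse_args()` gives `args.file_path`, `args.user_list` and `args.safe`.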

Improve Lyrics Output

Currently, we're just printing the JSON response. Display it like this:
09/34: I left my girl at home
10/34: I don't love her no more
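
A sketch of a formatter that produces that layout, assuming the transcribed lines are already available as a list of strings. `format_lyrics` is a hypothetical name.

```python
def format_lyrics(lines):
    """Return lines prefixed with a zero-padded 'index/total: ' counter."""
    total = len(lines)
    width = len(str(total))
    return [
        '{}/{}: {}'.format(str(index).zfill(width), total, line)
        for index, line in enumerate(lines, start=1)
    ]
```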

Internet Usage

In real-time mode, Google Cloud Speech will be called repeatedly within a very short period of time.

Monitor internet traffic while using Cleansio on a long audio file
Record download bandwidth
Record upload bandwidth
Create a wiki page with the results
Link to wiki page in this issue

Expand Vocal Pitch and BPM Research

Please add at least 3 more songs to https://github.com/PatrickDuncan/cleansio/wiki/Beats-Per-Minute-(BPM)-Research and https://github.com/PatrickDuncan/cleansio/wiki/Vocal-Pitch-Research

Class Structure

Break up the single script

Add License Badge

Explicit Word List Options

We need to combine the lists if the user provides one. We should also provide a way for the user to choose how the lists should be combined, or whether the user-provided list should override our internal one.

Check user-provided options to see what mode they want
Option to use just our list
Option to use just their list
Option to use a combination of both lists
If they want to combine the lists, put both lists into a set to remove duplicates, then create the new internal list of explicit words using that set's contents
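
The three options could be sketched like this. The mode names (`internal`, `user`, `combine`) and the function name are assumptions, not an agreed interface.

```python
MODES = ('internal', 'user', 'combine')


def build_explicit_list(internal, user=None, mode='combine'):
    """Return the sorted, de-duplicated list of explicits for the chosen mode."""
    if mode not in MODES:
        raise ValueError('unknown mode: {}'.format(mode))
    user = user or []
    if mode == 'internal':
        words = set(internal)
    elif mode == 'user':
        words = set(user)
    else:
        # A set union removes words that appear in both lists.
        words = set(internal) | set(user)
    return sorted(words)
```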

Check Adjacent Words

It's possible for multiple chunks to be individually inoffensive, but be offensive when combined with each other. (For example, the letters in F-*-*-* said individually).

These should also be censored.
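
A rough sketch of one way to check this, assuming the transcription is available as a list of words. It joins runs of adjacent words and looks the result up in the explicit set. The function name and the `max_span` limit are assumptions, and the example word is a harmless placeholder.

```python
def find_adjacent_explicits(words, explicits, max_span=4):
    """Return (start, end) index pairs whose joined words form an explicit."""
    hits = []
    for start in range(len(words)):
        joined = ''
        for end in range(start, min(start + max_span, len(words))):
            joined += words[end].lower()
            # Single words are handled by the normal check, so skip them here.
            if end > start and joined in explicits:
                hits.append((start, end))
    return hits
```

For example, `find_adjacent_explicits(['say', 'd', 'a', 'r', 'n'], {'darn'})` flags the words at indices 1 through 4.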

Logo

We need a logo for our project.

Timestamp Safe Mode

Use #84
Safe mode that widens the timestamp to decrease the chance that a word slips through the censor
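
Widening could be as simple as padding both ends and clamping to the chunk. This is a sketch with hypothetical names; the padding amount is left to the caller since no value has been decided.

```python
def widen_timestamp(start_ms, end_ms, padding_ms, chunk_length_ms):
    """Pad a (start, end) timestamp on both sides without leaving the chunk."""
    widened_start = max(0, start_ms - padding_ms)
    widened_end = min(chunk_length_ms, end_ms + padding_ms)
    return widened_start, widened_end
```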

Internal List of Explicits

We need to have an internal list of explicit words in case the user doesn't choose to provide their own list. This should be an encrypted YAML file.

Reference a reliable source for explicit words
Use a YAML file to list the explicits
Either encrypt the contents of the entire file or each explicit
- We're encrypting to avoid committing curse words to our repo

Crypto Suggestion

Research - Dynamics

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how the loudness of a snippet affects the accuracy
Provide examples

Sample Rate

Find the sample rate of the file and change the GCS request instead of making it static.

If the sampling rate cannot be found convert the sample rate to a default value.

File Structure

Everything is currently in the root. Break it up following https://realpython.com/python-application-layouts/

Increase Sample Rate if Less Than 16000

GCS requires at least 16000
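
A small sketch of picking the rate to send, based on the 16000 minimum stated above. The names are hypothetical and the actual resampling of the audio is not shown.

```python
MIN_SAMPLE_RATE = 16000  # Minimum stated in this issue.


def target_sample_rate(detected, minimum=MIN_SAMPLE_RATE):
    """Return the sample rate to use for the request and any resampling."""
    if detected is None:
        return minimum  # Rate could not be found: fall back to the default.
    return max(detected, minimum)
```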

Remove String Interpolation

String interpolation (f-strings) was added to Python in v3.6. We support v3.4-v3.6, so remove all f"".

We definitely need unit tests.
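
For reference, a sketch of the kind of replacement this involves, using `str.format`, which works on every version we support. `describe_chunk` is a made-up example, not a function from the codebase.

```python
def describe_chunk(index, total):
    """Return a progress label such as '9/34'."""
    # An f-string (Python 3.6+ only) would read: f'{index}/{total}'
    return '{}/{}'.format(index, total)
```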

Research - Snippet Length

Create a wiki page
Add the wiki page to the sidebar under "Lyric Accuracy Assessment"
Document how the length of the snippet affects the accuracy of the lyrics
Provide examples

Provide a cli arg to allow the user to specify where it should be stored
Provide a cli arg to allow the user to specify the audio encoding

User List of Explicits

We shouldn't use a hard-coded list. Please add code which loads in an encrypted YAML list of swear words before the rest of the program runs.

The file should either be in Cleansio's directory (gitignore the content) or they should provide the path to the YAML
There should be a sample YAML as a reference to the user
Pass the file through the command-line