faroit / stempeg

Python I/O for STEM audio files

Home Page: https://faroit.github.io/stempeg

License: MIT License

Python 93.92% Shell 0.97% Mako 5.11%

stems multitrack native-instruments python ffmpeg

stempeg's Issues

OSX quicklook support

🥳 OSX seems to support stem files and has a UI to select the stem right from the quicklook window.

However, it seems that it uses some specific metadata to read the stem track name. Currently I don't know how to do that with ffmpeg, but it would be great to find out if there is a way to support this.

Reading is too slow

Currently stempeg loads stems by

extracting stream-by-stream and converting each stream to wav
loading the wav files and merging them into a tensor

For each stream extraction a separate call to an external process is used. For small files this could slow down stempeg loading significantly.
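One possible direction, sketched below, is to ask ffmpeg to write all streams in a single invocation instead of one call per stream. The function name `build_extract_command` and the output naming pattern are hypothetical and not part of stempeg; the sketch only builds the argument list and does not measure whether this is actually faster.

```python
def build_extract_command(filename, n_streams, out_pattern="stem_{}.wav"):
    """Build ONE ffmpeg command that writes every audio stream to its own WAV.

    Hypothetical helper: stempeg does not expose this function.
    """
    cmd = ["ffmpeg", "-y", "-i", filename]
    for idx in range(n_streams):
        # "-map 0:a:<idx>" selects the idx-th audio stream of the first input;
        # each -map applies to the output file that follows it.
        cmd += ["-map", "0:a:{}".format(idx), out_pattern.format(idx)]
    return cmd


cmd = build_extract_command("track.stem.mp4", 3)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.check_call`, so only one process is spawned per file regardless of the number of stems.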

I think temporary files need to be closed before they are removed; I am running into a permission error

I am getting a permission error at
os.remove(tmps[tmp_id].name) because this temporary file is in use by a different application.

I added this line
tmps[tmp_id].close()
just before
os.remove(tmps[tmp_id].name) #line 124
in read.py
to resolve the issue.
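The close-before-remove order described above can be sketched in isolation as follows. This is a minimal illustration using only the standard library, not the actual code in read.py; the helper name `write_and_remove` is made up for the example.

```python
import os
import tempfile


def write_and_remove(data):
    """Write data to a named temporary file, then delete it safely."""
    # delete=False so that we control removal ourselves
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    try:
        tmp.write(data)
    finally:
        # Release the file handle first; on Windows an open handle
        # makes os.remove fail with a permission error.
        tmp.close()
    existed = os.path.exists(tmp.name)
    os.remove(tmp.name)
    return existed, os.path.exists(tmp.name)


print(write_and_remove(b"\x00" * 16))
```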

Stems write - Format not recognised

Hello,

As you stated in the documentation the stems write doesn't always work well.
I am using this command with ffmpeg to create a STEM file:

ffmpeg -i ~/mix.wav -i ~/drums.wav -i ~/vocals.wav -map 0 -map 1 -map 2 -c:a libfdk_aac -metadata:s:0 title=mix -metadata:s:1 title=drums -metadata:s:2 title=vocals ~/output.stem.mp4

I then tried to read it back using the musdb library and it works well.
I was wondering if this could be included in your library to finally make it work properly.

Unfortunately I do not have much time to work on this further and open a pull request, but I made a simple implementation in case it is of any help.
Also check homebrew-ffmpeg if the right codecs are not included in the official ffmpeg distribution.
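For reference, the command above could be assembled from Python roughly like this. It is only a sketch mirroring the flags in the command line quoted in this issue; `build_stem_write_command` is a hypothetical name, and the default codec `aac` (ffmpeg's built-in encoder) is an assumption chosen because `libfdk_aac` is not present in every ffmpeg build.

```python
def build_stem_write_command(inputs, output, codec="aac"):
    """Build an ffmpeg command that muxes several audio files into one file.

    inputs: list of (path, title) pairs, one per stem (hypothetical layout).
    """
    cmd = ["ffmpeg", "-y"]
    for path, _title in inputs:
        cmd += ["-i", path]
    for idx in range(len(inputs)):
        # map every input file to its own output stream
        cmd += ["-map", str(idx)]
    cmd += ["-c:a", codec]
    for idx, (_path, title) in enumerate(inputs):
        # per-stream title metadata, as in the command quoted above
        cmd += ["-metadata:s:{}".format(idx), "title={}".format(title)]
    cmd.append(output)
    return cmd


stems = [("mix.wav", "mix"), ("drums.wav", "drums"), ("vocals.wav", "vocals")]
print(" ".join(build_stem_write_command(stems, "output.stem.mp4")))
```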

Publishing conda package on conda-forge

Hello, I am opening this issue to inform you that I created and published a conda package on conda-forge.

The associated publishing PR is available at conda-forge/staged-recipes#9998 and a dedicated feedstock repository has been created at https://github.com/conda-forge/stempeg-feedstock.

If you want to be added as maintainer please tell me :).

Regards

Check if ffmpeg and ffprobe are installed

This might prevent users from making errors like this
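A check of this kind could look roughly like the sketch below, which uses `shutil.which` from the standard library to look the executables up on the PATH. The helper name `missing_tools` is hypothetical; how stempeg should report the problem (exception, warning) is left open.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of required executables that are not on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing required tools: " + ", ".join(missing))
else:
    print("ffmpeg and ffprobe found")
```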

Add ffmpeg 5 tests

read_stems disregards out_type parameter

read_stems takes a parameter, out_type, which is then never used, and the output is always np.float64, even when this is undesired. I am under the impression that the type should be passed to sf.read on read.py#123.

warnings.warning() does not exist

Bug Description:
When using stempeg as part of musdb, I encountered the following error:

        stem_durations = np.array([t.shape[0] for t in stems])
        if not (stem_durations == stem_durations[0]).all():
>           warnings.warning("Stems differ in length and were shortend")
E           AttributeError: module 'warnings' has no attribute 'warning'

/usr/local/lib/python3.9/site-packages/stempeg/read.py:299: AttributeError

After checking the warnings package, warning() does not exist.

Suggested Solution:
warnings.warning() -> warnings.warn() since warn() exists.
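A minimal sketch of the corrected call, reduced to plain Python lists so it runs without stempeg or numpy. The function `check_equal_lengths` is a stand-in for illustration, not the code in read.py.

```python
import warnings


def check_equal_lengths(lengths):
    """Warn (instead of crashing) when the stem lengths differ."""
    if any(n != lengths[0] for n in lengths):
        # warnings.warn is the function that exists in the standard library
        warnings.warn("Stems differ in length and were shortened")
        return False
    return True


print(hasattr(warnings, "warn"))
print(hasattr(warnings, "warning"))
print(check_equal_lengths([100, 100, 100]))
```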

stem2wav

Where is stem2wav? How to install that? I see stempeg 0.1.3 does not have that.

add docs

Given the scale of this project, https://pdoc3.github.io/pdoc/ seems a good choice.

16 bit flac output conversion?

Is there a way to convert the 4 stem output files from the new Open-Unmix UMX using Stempeg to output 16 bit flac files instead of the 24 bit flac files I am currently getting using it?

Thanks, Rog

Is this not working on windows?

import glob
import os

import stempeg

train_path = "path_to_train/"
# Build each path with os.path.join instead of chdir plus string concatenation
for file_path in glob.glob(os.path.join(train_path, "*.stem.mp4")):
    print(os.path.isfile(file_path))
    S, rate = stempeg.read_stems(file_path)

Even though isfile returns True, read_stems throws 'FileNotFoundError: [WinError 2] '

Freeze when loading mp4 multi-stem file

I am using the musdb package and convert the mp4 files containing multiple audio sources to wave files, as shown here:

https://github.com/f90/Wave-U-Net/blob/master/Datasets.py#L132

But randomly during conversion (so with potentially any file), conversion just freezes forever. After interrupting the process I can read the following error:

Traceback (most recent call last):
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1668, in <module>
main()
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1662, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/opt/local/pycharm/helpers/pydev/pydevd.py", line 1072, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/mnt/daten/PycharmProjects/Wave-U-Net/Training.py", line 326, in <module>
@ex.automain
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 137, in automain
self.run_commandline()
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 260, in run_commandline
return self.run(cmd_name, config_updates, named_configs, {}, args)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/experiment.py", line 209, in run
run()
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/run.py", line 221, in __call__
self.result = self.main_function(*args)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/sacred/config/captured_function.py", line 46, in captured_function
result = wrapped(*args, **kwargs)
File "/mnt/daten/PycharmProjects/Wave-U-Net/Training.py", line 348, in dsd_100_experiment
dsd_train, dsd_test = Datasets.getMUSDB(model_config["musdb_path"]) # List of (mix, acc, bass, drums, other, vocal) tuples
File "/mnt/daten/PycharmProjects/Wave-U-Net/Datasets.py", line 149, in getMUSDB
vocal_audio = track.targets["vocals"].audio
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/musdb/audio_classes.py", line 113, in audio
audio = source.audio
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/musdb/audio_classes.py", line 47, in audio
filename=self.path, stem_id=self.stem_id
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 91, in read_stems
FFinfo = FFMPEGInfo(filename)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 19, in __init__
self.json_info = read_info(self.filename)
File "/home/daniel/tf-env-waveunet/local/lib/python2.7/site-packages/stempeg/read.py", line 55, in read_info
out = sp.check_output(cmd)
File "/usr/lib/python2.7/subprocess.py", line 567, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1319, in _execute_child
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
KeyboardInterrupt

Process finished with exit code 1

It seems that the ffmpeg/ffprobe process that identifies the stems within the mp4 file never returns, or returns empty output, or something of that sort, so that the stempeg library waits forever for a response at sp.check_output. It doesn't look like there is a timeout for waiting for the ffmpeg output either. Plus ffmpeg is called with -v error; maybe that is suppressing errors that we should react to?

Any idea of how to fix this?
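One way to avoid waiting forever would be to pass a timeout to the subprocess call. The sketch below assumes Python 3.3 or later (the `timeout` argument is not available in the Python 2.7 `subprocess` module shown in the traceback) and uses the Python interpreter itself as a stand-in for ffprobe so that it runs anywhere. The helper name `run_with_timeout` and the timeout values are illustrative only.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return its stdout, or None if it does not finish in time."""
    try:
        # check_output kills the child process when the timeout expires
        return subprocess.check_output(cmd, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None


# Stand-in for a hanging ffprobe call: a child process that sleeps too long
hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(hanging, timeout=1))
```

A caller could then retry or raise a descriptive error when `None` comes back, instead of blocking the whole conversion.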

Switch from travis to github workflow

stempeg.read_stems got an unexpected keyword argument 'sample_rate' 'ffmpeg_format'

I have installed the package using pip, but it seems it's not the right version or something.
The version I have is '0.1.8'; the latest, '0.2', doesn't install with pip (note: I'm using Windows 10).

No wave written

Hi,

The write function does not write anything and no error message is shown.

I found that when I remove "-acodec libfdk_aac" from the write function, it works. Looking forward to a fix for this issue!

Best wishes,

Qiuqiang

Native Instruments (Traktor) Format

Hi, I found 'The Easton Ellises - Falcon 69.stem.mp4' didn't load up as a Stems file in Traktor. Was that an intended output for stempeg?

Tests failing with wrong shapes

Hi author(s),

I'm trying to run the tests included in this package, but the assert statements on the shapes of the stems are failing. The tests expect a shape of (5, 265216, 2) but the file has a shape of (5, 267264, 2).

Is this a bug or have the files been updated without updating the tests?

Thanks!

Add a check for mono files

As pointed out in #19, when writing mono files using stempeg, the codec expects stereo files. However, no exception is raised, so the resulting files play back at double rate.

It would be better to check that the number of channels == 2 before writing files.
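Such a check could be as small as the sketch below. It assumes the channel count is the last axis of the array shape, as in the (5, 265216, 2) shapes mentioned in the test issue above; the function name `check_stereo` is hypothetical, and it takes a plain shape tuple so that it runs without numpy.

```python
def check_stereo(shape):
    """Raise ValueError unless the last axis (assumed: channels) has size 2."""
    channels = shape[-1]
    if channels != 2:
        raise ValueError(
            "expected stereo audio (2 channels), got {}".format(channels)
        )


check_stereo((5, 265216, 2))  # passes silently
try:
    check_stereo((5, 265216, 1))
except ValueError as err:
    print(err)
```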

add audio2stem cli

creating stems from several audio files should also be available from the commandline

Support reading from file-like objects

Supporting file-like objects to read and decode in-memory data would be a useful enhancement.
There may be problems, as suggested here, though: kkroening/ffmpeg-python#292

Ffprobe command returns non-zero exit status 3221225478

I am running it on Anaconda.
It seems to work perfectly on Colab. However, on Anaconda it fails.

The behavior is weird as well. I ran the command in bash and it runs correctly.

I have a loop which runs through all the stem files, and it breaks after a random number of iterations, giving the error stated below.
I believe this could be a multiprocessing issue.
Could it be that the file is already being used by another process?

File "", line 1, in
runfile('C:/Users/w1572032/.spyder-py3/temp.py', wdir='C:/Users/w1572032/.spyder-py3')

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
execfile(filename, namespace)

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/w1572032/.spyder-py3/temp.py", line 28, in <module>
t = np.copy(track.targets['vocals'].audio.T)

File "C:\ProgramData\Anaconda3\lib\site-packages\musdb\audio_classes.py", line 113, in audio
audio = source.audio

File "C:\ProgramData\Anaconda3\lib\site-packages\musdb\audio_classes.py", line 47, in audio
filename=self.path, stem_id=self.stem_id

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 90, in read_stems
FFinfo = FFMPEGInfo(filename)

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 19, in __init__
self.json_info = read_info(self.filename)

File "C:\ProgramData\Anaconda3\lib\site-packages\stempeg\read.py", line 54, in read_info
out = sp.check_output(cmd)

File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 336, in check_output
**kwargs).stdout

File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 418, in run
output=stdout, stderr=stderr)

CalledProcessError: Command '['ffprobe', 'C:\Users\w1572032\Desktop\musdb18\train\BigTroubles - Phantom.stem.mp4', '-v', 'error', '-print_format', 'json', '-show_format', '-show_streams']' returned non-zero exit status 3221225478.
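As a diagnostic aid: large exit statuses like this one on Windows are usually NTSTATUS codes, which are easier to look up in hexadecimal. The sketch below converts the value; the small lookup table contains only two well-known codes and is an assumption added for illustration, not something stempeg or ffprobe provides.

```python
# Two well-known NTSTATUS values (illustrative, far from complete)
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000006: "STATUS_IN_PAGE_ERROR",
}


def describe_exit_status(status):
    """Return the status in hex plus a name if it is in the small table."""
    name = KNOWN_NTSTATUS.get(status, "unknown")
    return "0x{:08X} ({})".format(status, name)


print(describe_exit_status(3221225478))
```

If the code is indeed STATUS_IN_PAGE_ERROR, that would point towards a problem reading the file from disk or a network share rather than a bug in the Python code, but this is only a guess from the number.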

Support seeking

ffmpeg supports seeking using a time position in seconds

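e.g. -ss 10.0 -to 15.0 -i input.mp4 seeks to second 10 and reads until second 15 (if -to is placed after -i, the input seek resets timestamps to zero, so -to 15.0 would yield 15 seconds of output instead)
e.g. -ss 10.0 -i input.mp4 -t 5.0 seeks to second 10 and outputs an excerpt of 5 seconds duration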
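A sketch of how start and duration parameters could be translated into ffmpeg input options. The function name `build_seek_args` and its parameters are hypothetical, not the stempeg API; both options are placed before `-i` so that they apply to the input.

```python
def build_seek_args(filename, start=None, duration=None):
    """Build the ffmpeg input arguments for reading an excerpt of a file."""
    args = []
    if start is not None:
        # seek to this position (in seconds) before decoding
        args += ["-ss", str(start)]
    if duration is not None:
        # limit how many seconds are read from the input
        args += ["-t", str(duration)]
    args += ["-i", filename]
    return args


print(build_seek_args("input.mp4", start=10.0, duration=5.0))
```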

A loading error in Win System.

My data has a name format like 'xxxx - xxxx.stem.m64', but stempeg cannot handle the blank space before the "-", so it raises an error saying the file does not exist.

The dataset is actually the MUSDB18-7 set

I worked around it with the following code, which deletes the space before the hyphen. I hope there is a better way to solve it.

index = track_name.index('-')
track_name = track_name[:index-1] + track_name[index:]

Evaluate dropping soundfile

spleeter implements a pure-python ffmpeg adapter. It would be interesting to see if this one performs faster than the current tmpfile decoding and soundfile reading.

At the same time this would allow us to also read and write from arbitrary audio formats, supported by ffmpeg, hence enabling mp3 -> stems

ffmpeg -version contains letter in version string

Hey,

I don't know if this library is only supposed to work with certain ffmpeg versions, and I didn't really dig too deep to be honest. Running ffmpeg -version shows me the following output:

ffmpeg version n4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.1.0 (GCC)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-nvdec --enable-nvenc --enable-omx --enable-shared --enable-version3
libavutil      56. 51.100 / 56. 51.100
libavcodec     58. 91.100 / 58. 91.100
libavformat    58. 45.100 / 58. 45.100
libavdevice    58. 10.100 / 58. 10.100
libavfilter     7. 85.100 /  7. 85.100
libswscale      5.  7.100 /  5.  7.100
libswresample   3.  7.100 /  3.  7.100
libpostproc    55.  7.100 / 55.  7.100

As you can see, the version string contains the letter 'n', which does not fit the regex in __init__.py line 68 re.findall(r'ffmpeg version (\d+\.)?(\d+\.)?(\*|\d+)', hay). After replacing the regex with ffmpeg version \w?(\d+\.)?(\d+\.)?(\*|\d+) everything started to work. I haven't encountered any errors so far, so you may want to consider changing the regex.

I'm working on a Linux 5.9.10-arch1-1 machine.
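The difference between the two patterns quoted above can be reproduced with a short script. The first banner line is copied from the output in this issue; the second, without the leading 'n', is a made-up example added to check that the relaxed pattern still matches plain version strings.

```python
import re

OLD_PATTERN = r'ffmpeg version (\d+\.)?(\d+\.)?(\*|\d+)'
NEW_PATTERN = r'ffmpeg version \w?(\d+\.)?(\d+\.)?(\*|\d+)'


def find_version(pattern, text):
    """Return the list of version tuples found in an ffmpeg banner."""
    return re.findall(pattern, text)


arch_banner = "ffmpeg version n4.3.1 Copyright (c) 2000-2020 the FFmpeg developers"
plain_banner = "ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers"

for banner in (arch_banner, plain_banner):
    print(find_version(OLD_PATTERN, banner), find_version(NEW_PATTERN, banner))
```

The old pattern finds nothing in the 'n4.3.1' banner, while the relaxed one returns the same tuple for both banners.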