audeering / audiofile

Handling audio files in Python

Home Page: https://audeering.github.io/audiofile/

License: Other

Python 100.00%

audio read write

audiofile's Issues

Use libsndfile for reading/writing MP3

Since version 1.1.0, libsndfile supports reading and writing MP3 files; see also libsndfile/libsndfile#258 and libsndfile/libsndfile#499.

So we should check whether we can speed up MP3 handling by using libsndfile directly.

Error when duration is set to 0.0

The following works:

x, sr = af.read('test.wav', duration=0)
x.shape

(0,)

x, sr = af.read('test.wav', duration=-1.0)
x.shape

(0,)

This does not:

af.read('test.wav', duration=0.0)

TypeError: slice indices must be integers or None or have an __index__ method
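
The traceback suggests that a float ends up being used as a slice index. A minimal sketch of one way to avoid this, assuming the duration is converted to an integer number of samples before slicing; `seconds_to_samples` is a hypothetical helper and not part of audiofile:

```python
def seconds_to_samples(seconds, sampling_rate):
    """Convert a time in seconds to a non-negative integer sample count."""
    # round() applied to a float returns an int, which is a valid slice index
    return max(0, round(seconds * sampling_rate))


signal = [0.1, 0.2, 0.3, 0.4]  # stand-in for the samples of a file
stop = seconds_to_samples(0.0, sampling_rate=16000)
empty = signal[:stop]  # slicing works because stop is an int
```

With this conversion, duration=0, duration=0.0 and negative durations would all yield an empty signal.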

Inconsistent behavior of "sloppy" flag when determining duration of empty audio file

When determining the duration of an empty audio file,
using sloppy=False returns 0.0 (float),
whereas with sloppy=True the code crashes.

This happens due to the conversion of "None" to "float".
Imo it would make sense to also return 0.0.

Pull Request and test demoing this suggestion will follow.
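
A minimal sketch of the suggested behavior, assuming the crash comes from calling float() on None; `to_duration` is a hypothetical helper, not audiofile's actual code:

```python
def to_duration(raw):
    """Return a duration in seconds, treating a missing value as 0.0."""
    if raw is None:
        # No duration was reported, e.g. for an empty file
        return 0.0
    return float(raw)
```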

Windows tests fail for MP3 files

For unknown reasons we are not able to make ffmpeg work for MP3 files inside the Windows tests.
It fails when trying to convert the file, e.g. https://github.com/audeering/audiofile/pull/44/checks?check_run_id=3251320851

For other non SND files like MP4 converting works.

Remove dependency on pysox

As an alternative to #65, we could simply remove the dependency on pysox
and call sox and soxi ourselves as we do with ffmpeg and mediainfo.
This would make us less dependent and avoid strange log warnings like:

       main  fw8h09ikjtb9  2022-04-14 11:59:23.880      If you do (or think that you should) have SoX, double-check your
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880       - - - http://sox.sourceforge.net/ - - -
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880      If you do not have SoX, proceed here:
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880  SoX could not be found!
       main  fw8h09ikjtb9  2022-04-14 11:59:23.879  /bin/sh: 1: sox: not found

Normalize argument ignored in convert_to_wav()

In #99 we switched to use audiofile.write() inside audiofile.convert_to_wav() and added the normalize argument to audiofile.convert_to_wav(), but forgot to pass it on to audiofile.write().

Reading file may hang if offset is greater than file duration

I have an MP3-encoded stereo file with the following duration:

>>> audiofile.duration(path)
3.996734693877551

Reading in the full file works:

>>> audiofile.read(path)
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]] 44100

Also reading with an offset of 3 seconds works:

>>> audiofile.read(path, offset=3.0)
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]] 44100

But when I try to read with an offset greater than the file duration it hangs instead of returning an empty array.

I tested with a WAV file and there it worked. So it seems to be related to reading encoded files with sox:

dtype argument not propagated to soundfile

When calling audiofile.read, the user can pass optional **kwargs on to soundfile.read. However, not all soundfile.read arguments are available, as some of them are already set by audiofile (see https://audeering.github.io/audiofile/_modules/audiofile/core/io.html#read). For example, trying to set dtype results in:

>>> x, fs = audiofile.read(file, dtype='int16')
x, fs = audiofile.read(file, dtype='int16')
  File "/home/audeering.local/atriant/envs/devaice/lib/python3.6/site-packages/audiofile/core/io.py", line 90, in read
    **kwargs,
TypeError: read() got multiple values for keyword argument 'dtype'

However, it is not transparent from the docs that dtype is set under the hood.

Note: I am not sure I need the option to override dtype (I am still searching for some better solution with audinterface than calling audiofile myself), but overall we should at least update the docs to tell the user what they can set and what not.
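
Besides updating the docs, the clash could be reported with a clearer message. A rough sketch, assuming the reserved arguments are checked before calling soundfile.read; `RESERVED_KWARGS` and `check_kwargs` are hypothetical names, and only dtype is known from the traceback above to be set internally:

```python
# Assumption: extend this set with any other argument that is fixed internally
RESERVED_KWARGS = {"dtype"}


def check_kwargs(kwargs):
    """Raise a descriptive error if a reserved argument is passed."""
    clashes = sorted(RESERVED_KWARGS & set(kwargs))
    if clashes:
        raise TypeError(
            "The following arguments are set internally "
            f"and cannot be overridden: {', '.join(clashes)}"
        )
    return kwargs
```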

Add examples to the docstring

At the moment we only have examples in the usage section, but not in the individual docstrings.

Allow outfile in convert_to_wav() to be None

For reference see #101 (comment).

When converting a file to WAV we might want to use the same path as the original file, but just store a WAV file, e.g.

audiofile.convert_to_wav('file.mp4', 'file.wav')

In this case we might think about just providing None as second argument:

audiofile.convert_to_wav('file.mp4', None)

or add None as default value to the outfile argument of audiofile.convert_to_wav(), then we could just do:

audiofile.convert_to_wav('file.mp4')
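
A possible way to derive the default, sketched with a hypothetical helper `default_outfile` that swaps the extension of the input path; this is not how audiofile currently behaves:

```python
import os


def default_outfile(infile, outfile=None):
    """Return outfile, or infile with a .wav extension if outfile is None."""
    if outfile is None:
        # Note: for an input that already ends in .wav
        # this yields the input path itself
        outfile = os.path.splitext(infile)[0] + ".wav"
    return outfile
```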

Fallback if sox is not available

Reading compressed files like mp3 can fail on systems where sox is not available. Since it is not always possible to install sox (e.g. on Google Cloud Functions) it would be good to have a fallback. For instance, librosa.load uses audioread if loading a file with sndfile fails.
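
Such a fallback could be organized as a chain of readers that are tried in order. The following is only a sketch with stand-in readers; `read_with_fallback` and both reader functions are hypothetical and do not decode any audio:

```python
def read_with_fallback(path, readers):
    """Try each reader in turn and return the first successful result."""
    errors = []
    for reader in readers:
        try:
            return reader(path)
        except Exception as ex:  # a real implementation should catch narrower errors
            errors.append(f"{reader.__name__}: {ex}")
    raise RuntimeError(f"Could not read '{path}': " + "; ".join(errors))


def failing_reader(path):
    # Stand-in for a backend whose binary is not installed
    raise OSError("sox: not found")


def working_reader(path):
    # Stand-in for a second backend that succeeds
    return f"decoded {path}"


result = read_with_fallback("file.mp3", [failing_reader, working_reader])
```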

Read audio with specific sampling rate and option for mixdown

Some libraries offer the option to set a target rate when reading an audio file, see e.g. librosa.load(). I wonder if we should also give the user the option to request a specific sampling rate, and maybe also the option for a mixdown, using our audresample package. This could be quite convenient for users not aware of audresample.

Add return value to convert_to_wav()

As pointed out in #101 (comment) it might be useful to add the path (expanded by audeer.path()) to the created WAV file as a return value to convert_to_wav(), e.g.

path = audiofile.convert_to_wav('file.mp4', 'file.wav')

Reading with sox and offset returns wrong duration

If you read a file in a format that is not WAV, FLAC, or OGG but is supported by sox, and you specify an offset without a duration, the duration of the returned signal can be wrong.

In the tests an MP3 file is created with one or two channels and 48000 Hz sampling rate. When reading this using sox the returned duration is wrong only for the two channel case:

        sig, fs = af.read(mp3_file, offset=offset)
        assert_allclose(
            _duration(sig, sampling_rate),
            af.duration(mp3_file) - offset,
            rtol=0,              
>           atol=tolerance('duration', sampling_rate),
        )
E       AssertionError:
E       Not equal to tolerance rtol=0, atol=2.08333e-05
E
E       Mismatched elements: 1 / 1 (100%)  
E       Max absolute difference: 0.024      
E       Max relative difference: 0.02643172
E        x: array(0.884)
E        y: array(0.908)

Problem with files with incomplete header

A file provided in a project causes a problem with pyopensmile further down the line: ERROR: NULL pointer access.
@frankenjoe is familiar with this as well.
The output from soxi reveals that the file in question seems to be lacking Duration, File Size and Bit Rate:

(15:52)(~/xxx/) [py:projectenv] (master)
$ soxi data/recordings/<critical file>.wav
Input File     : 'data/recordings-all/<critical file>.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

(15:52)(~/xxx/) [py:projectenv] (master)
$ soxi data/recordings/<normal file>.wav
Input File     : 'data/recordings-all/<normal file>.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:00:40.00 = 319968 samples ~ 2999.7 CDDA sectors
File Size      : 640k
Bit Rate       : 128k
Sample Encoding: 16-bit Signed Integer PCM
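
Files like this could be detected early by checking which fields the soxi output contains. A small sketch, assuming the output has the "Key : value" layout shown above; `parse_soxi` and `missing_fields` are hypothetical helpers and the file name is a placeholder:

```python
EXPECTED_FIELDS = ("Duration", "File Size", "Bit Rate")


def parse_soxi(output):
    """Parse 'Key : value' lines of soxi output into a dictionary."""
    info = {}
    for line in output.splitlines():
        # Split at the first colon only, as values may contain colons
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def missing_fields(output):
    """Return the expected header fields that soxi did not report."""
    info = parse_soxi(output)
    return [field for field in EXPECTED_FIELDS if field not in info]


critical = """Input File     : 'critical.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM"""
missing = missing_fields(critical)
```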

Improve error message when trying to run the "duration" function and "mediainfo" is not installed

When running duration on an m4a (AAC) file, I get the following error:

af.duration(my_file)
Traceback (most recent call last):
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/core.py", line 149, in soxi
    stderr=subprocess.PIPE
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['sox', '--i', '-r', '/home/acrespi/Downloads/jellysplash.m4a']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 146, in sampling_rate
    return int(sox.file_info.sample_rate(file))
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/file_info.py", line 205, in sample_rate
    output = soxi(input_filepath, 'r')
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/core.py", line 153, in soxi
    raise SoxiError("SoXI failed with exit code {}".format(cpe.returncode))
sox.core.SoxiError: SoXI failed with exit code 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 104, in duration
    return samples(file) / sampling_rate(file)
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 149, in sampling_rate
    return int(run(cmd))
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/utils.py", line 17, in run
    stderr=subprocess.STDOUT
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mediainfo': 'mediainfo'

It is not very straightforward to understand that all I need to do is install mediainfo in order to make this work (as stated in the installation instructions). A more explicit error message would be welcome. :)
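
One way to get such a message is to check for the binary before calling it. This is a sketch only; the `run` function below is a hypothetical stand-in and not the helper from audiofile/core/utils.py:

```python
import shutil
import subprocess


def run(cmd):
    """Run a command and return its output, failing early if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(
            f"'{cmd[0]}' could not be found. "
            "Please install it, see the installation instructions."
        )
    output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    return output.decode().strip()
```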

Add sampling_rate argument to convert_to_wav()

In #99 we have added bit_depth as argument to audiofile.convert_to_wav() as internally we now use audiofile.write().

As there might be the use case of converting several files with different sampling rates to the same sampling rate,
we might also think about adding sampling_rate=None as an argument.

Expand home directory and create directory if it does not exist

Currently, the following is not working:

signal = np.zeros((1, 16000), np.float32)
try:
    audiofile.write('/does/not/exist/test.wav', signal, 16000)
except Exception as ex:
    print(ex)

Error opening '/does/not/exist/test.wav': System error.

I usually find it convenient if I don't have to expand the filename ~~or create the directory tree~~ when I create a file.
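
Both steps can be sketched with the standard library. `prepare_path` is a hypothetical helper, not part of audiofile, and the directory creation can be dropped if only the expansion is wanted:

```python
import os
import tempfile


def prepare_path(path):
    """Expand ~ in path and make sure its parent directory exists."""
    path = os.path.abspath(os.path.expanduser(path))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


# Demonstrate inside a temporary directory to avoid touching real folders
with tempfile.TemporaryDirectory() as tmp:
    target = prepare_path(os.path.join(tmp, "does", "not", "exist", "test.wav"))
    created = os.path.isdir(os.path.dirname(target))
```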

Handle negative duration and offset values

As discussed in #109 (comment) we do not really have a consistent behavior when duration and/or offset are set to negative values in audiofile.read().

There are two possible solutions:

Ignore negative values

We could automatically set offset=0 and duration=0 if they are set to negative values.

Support negative values

We could interpret negative values to be counted from the end of the signal.
This would lead to the following behavior:

Suppose we have a 3 second long signal symbolized by ABC stored in file.wav. Here is what audiofile.read('file.wav', duration=duration, offset=offset) would return:

offset	duration	Returned signal
0	None	ABC
0	1	A
1	1	B
1	None	BC
-1	-1	B
-1	1	C
-2	None	BC
0	-1	C
-1	None	C
1	-1	A

So, in principle, it seems possible to support negative values. The combination of offset=0 and duration=-1 seems to be the most problematic case. One could argue that it should return an empty signal and only return C when offset=-0, but I don't like that solution.
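
The table above can be reproduced with a few lines of logic. The following sketch is one possible interpretation, including the special case for offset=0 with a negative duration; `resolve_segment` is a hypothetical helper that works on seconds and ignores sample rounding:

```python
def resolve_segment(offset, duration, total):
    """Return (begin, end) in seconds for a signal that is total seconds long."""
    # A negative offset is counted from the end of the signal
    start = total + offset if offset < 0 else offset
    if duration is None:
        begin, end = start, total
    elif duration >= 0:
        begin, end = start, start + duration
    else:
        # A negative duration selects the part before the start position;
        # offset=0 is treated as the end of the signal in that case
        if offset == 0:
            start = total
        begin, end = start + duration, start
    # Clip to the available signal
    begin = min(max(begin, 0), total)
    end = min(max(end, 0), total)
    return begin, end


signal = "ABC"  # one character per second
begin, end = resolve_segment(offset=-1, duration=-1, total=len(signal))
segment = signal[begin:end]
```

Treating offset=0 as the end of the signal only when duration is negative keeps the common case offset=0, duration=None unchanged.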

No RuntimeError raised for missing file

In the documentation, most of the functions are stated to raise a RuntimeError if the file is missing.

But when I call one on a non-existing file, I get:

>>> audiofile.("non-existent.wav")
...
LibsndfileError: Error opening '.../non-existing.wav': System error.

Benchmark partial file reading

As far as I see the benchmark only measures how long it takes to read full files. Might be interesting to also benchmark how long it takes for a library to partially read files using an offset and a duration.

Investigate if we can remove sox

As stated in #64 (comment) it might be easier if we don't depend on sox at all in audiofile.

The only thing we need to check is whether reading MP3 files with ffmpeg and accessing MP3 metadata with mediainfo is as fast as with sox.

Offset and duration in samples

Currently, offset and duration are provided in seconds and internally rounded to samples. It would be good to have the option to provide those arguments directly in samples, e.g. to read a long file in small chunks of a certain length. We could introduce a unit argument as we have in audinterface.Feature.
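
A rough sketch of how a unit argument and chunk-wise reading could fit together. The helper names and the accepted unit values ("seconds", "samples") are assumptions made for illustration, not the audinterface API:

```python
def to_samples(value, sampling_rate, unit="seconds"):
    """Convert value to a number of samples according to unit."""
    if unit == "samples":
        return int(value)
    if unit == "seconds":
        return round(value * sampling_rate)
    raise ValueError(f"Unknown unit '{unit}'")


def chunk_ranges(num_samples, chunk_size):
    """Yield (start, stop) sample indices covering a file in chunks."""
    for start in range(0, num_samples, chunk_size):
        # The last chunk may be shorter than chunk_size
        yield start, min(start + chunk_size, num_samples)


ranges = list(chunk_ranges(num_samples=10, chunk_size=4))
```

Each (start, stop) pair could then be passed on as offset and duration in samples, which avoids rounding from seconds at the chunk borders.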

Support reading/writing wav files using non-standard codecs

We are encountering problems when trying to read WAV files that use codecs like opus or riffs. For the time being we have resorted to using audioread with GStreamer via PyGObject. Will these codecs be supported in the foreseeable future?

Benchmark results

@hagenw nice library!

Looking at the benchmark results and the code, I wonder how audiofile is faster than soundfile on average when it wraps soundfile.read for reading?