audiofile's People

Contributors

christiangeng, damix48, hagenw

audiofile's Issues

Error when duration is set to 0.0

The following works:

x, sr = af.read('test.wav', duration=0)
x.shape
(0,)
x, sr = af.read('test.wav', duration=-1.0)
x.shape
(0,)

This does not:

af.read('test.wav', duration=0.0)
TypeError: slice indices must be integers or None or have an __index__ method
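
The TypeError suggests that the float duration ends up in a slice without being converted to an integer. A minimal sketch of a conversion that would avoid this, assuming a hypothetical helper seconds_to_samples() that is not part of audiofile:

```python
def seconds_to_samples(seconds, sampling_rate):
    """Convert a time in seconds to a non-negative integer sample count.

    Hypothetical helper for illustration, not part of audiofile.
    """
    return max(0, int(round(seconds * sampling_rate)))


sampling_rate = 16000
signal = list(range(sampling_rate))  # stand-in for one second of audio

# int 0, float 0.0 and negative values all give a valid integer slice index
for duration in (0, 0.0, -1.0, 0.5):
    n = seconds_to_samples(duration, sampling_rate)
    print(duration, len(signal[:n]))
```

Clamping negative values to zero mirrors the empty result reported above for duration=-1.0; whether that is the desired behavior is a separate question.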

Remove dependency on pysox

As an alternative to #65, we could simply remove the dependency on pysox
and call sox and soxi ourselves as we do with ffmpeg and mediainfo.
This would make us less dependent and avoid strange log warnings like:

       main  fw8h09ikjtb9  2022-04-14 11:59:23.880      If you do (or think that you should) have SoX, double-check your
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880       - - - http://sox.sourceforge.net/ - - -
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880      If you do not have SoX, proceed here:
       main  fw8h09ikjtb9  2022-04-14 11:59:23.880  SoX could not be found!
       main  fw8h09ikjtb9  2022-04-14 11:59:23.879  /bin/sh: 1: sox: not found 

Reading file may hang if offset is greater than file duration

I have an MP3-encoded stereo file of the following length:

>>> audiofile.duration(path)
3.996734693877551

Reading in the full file works:

>>> audiofile.read(path)
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]] 44100

Also reading with an offset of 3 seconds works:

>>> audiofile.read(path, offset=3.0)
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]] 44100

But when I try to read with an offset greater than the file duration it hangs instead of returning an empty array.

I tested with a WAV file and there it worked, so it seems to be related to reading encoded files with sox.
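
One way to avoid the hang could be to clip offset and duration against the file duration before any decoder is called, and to return an empty signal directly when nothing is left to read. A minimal sketch, assuming a hypothetical helper clip_to_file() that is not part of audiofile:

```python
def clip_to_file(offset, duration, file_duration):
    """Return (offset, duration) in seconds, clipped to the file length.

    Hypothetical helper; duration=None means "read until the end".
    """
    offset = min(max(offset, 0.0), file_duration)
    remaining = file_duration - offset
    if duration is None:
        duration = remaining
    duration = min(max(duration, 0.0), remaining)
    return offset, duration


file_duration = 3.996734693877551  # value reported above
offset, duration = clip_to_file(5.0, None, file_duration)
if duration == 0:
    print("nothing to read, return an empty signal without decoding")
```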

dtype argument not propagated to soundfile

When calling audiofile.read, the user can pass optional **kwargs on to soundfile.read. However, not all soundfile.read arguments are available, as some of them are already set by audiofile (see https://audeering.github.io/audiofile/_modules/audiofile/core/io.html#read). For example, trying to set dtype results in:

>>> x, fs = audiofile.read(file, dtype='int16')
x, fs = audiofile.read(file, dtype='int16')
  File "/home/audeering.local/atriant/envs/devaice/lib/python3.6/site-packages/audiofile/core/io.py", line 90, in read
    **kwargs,
TypeError: read() got multiple values for keyword argument 'dtype'

However, it is not transparent from the docs that dtype is set under the hood.

Note: I am not sure I need the option to override dtype (I am still searching for some better solution with audinterface than calling audiofile myself), but overall we should at least update the docs to tell the user what they can set and what not.
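
Besides documenting the reserved arguments, the clash could be reported with a more explicit message. A minimal sketch, assuming a hypothetical helper check_kwargs(); only dtype is confirmed as reserved by the traceback above, so the RESERVED tuple is illustrative:

```python
# Only 'dtype' is confirmed by the traceback above; extend as needed
RESERVED = ("dtype",)


def check_kwargs(kwargs, reserved=RESERVED):
    """Raise a readable error if the user passes an internally set argument.

    Hypothetical helper for illustration, not part of audiofile.
    """
    clashes = sorted(set(kwargs) & set(reserved))
    if clashes:
        raise TypeError(
            "These arguments are set internally and cannot be passed on: "
            + ", ".join(clashes)
        )


try:
    check_kwargs({"dtype": "int16"})
except TypeError as ex:
    print(ex)
```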

Allow outfile in convert_to_wav() to be None

For reference see #101 (comment).

When converting a file to WAV we might want to use the same path as the original file, but just store a WAV file, e.g.

audiofile.convert_to_wav('file.mp4', 'file.wav')

In this case we might think about just providing None as second argument:

audiofile.convert_to_wav('file.mp4', None)

or add None as default value to the outfile argument of audiofile.convert_to_wav(), then we could just do:

audiofile.convert_to_wav('file.mp4')
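
A minimal sketch of how the default could be derived, assuming a hypothetical helper default_outfile() that replaces the file extension of the input with .wav:

```python
from pathlib import Path


def default_outfile(infile, outfile=None):
    """Return outfile, or infile with its extension replaced by '.wav'.

    Hypothetical helper for illustration, not part of audiofile.
    """
    if outfile is None:
        outfile = Path(infile).with_suffix(".wav")
    return str(outfile)


print(default_outfile("file.mp4"))
print(default_outfile("file.mp4", "other.wav"))
```

Note that for an input that already ends in .wav the derived path equals the input path, so that case would need to be handled explicitly.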

Fallback if sox is not available

Reading compressed files like mp3 can fail on systems where sox is not available. Since it is not always possible to install sox (e.g. on Google Cloud Functions) it would be good to have a fallback. For instance, librosa.load uses audioread if loading a file with sndfile fails.
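
A fallback could be structured as a list of readers that are tried in order. The sketch below is only an illustration of that pattern: read_with_fallback() and both stand-in readers are hypothetical and do not call sox or any real decoder.

```python
def read_with_fallback(path, readers):
    """Try each reader in turn and return the first successful result."""
    errors = []
    for reader in readers:
        try:
            return reader(path)
        except Exception as ex:
            errors.append(f"{reader.__name__}: {ex}")
    raise RuntimeError(f"Could not read '{path}'. Tried: " + "; ".join(errors))


def read_with_sox(path):
    # Stand-in that simulates a system without sox
    raise FileNotFoundError("sox not found")


def read_with_other_backend(path):
    # Stand-in for any alternative decoder
    return [0.0, 0.0], 44100


signal, sampling_rate = read_with_fallback(
    "file.mp3", [read_with_sox, read_with_other_backend]
)
print(signal, sampling_rate)
```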

Read audio with specific sampling rate and option for mixdown

Some libraries offer the option to set a target rate when reading an audio file, see e.g. librosa.load(). I wonder if we should also give the user the option to request a specific sampling rate, and maybe also the option for a mixdown, using our audresample package. This could be quite convenient for users not aware of audresample.

Add return value to convert_to_wav()

As pointed out in #101 (comment) it might be useful to add the path (expanded by audeer.path()) to the created WAV file as a return value to convert_to_wav(), e.g.

path = audiofile.convert_to_wav('file.mp4', 'file.wav')

Reading with sox and offset returns wrong duration

If you read a file in a format that is not WAV, FLAC, or OGG, but is supported by sox, and you ask for an offset without specifying a duration, the duration of the returned signal can be wrong.

In the tests an MP3 file is created with one or two channels and 48000 Hz sampling rate. When reading this using sox the returned duration is wrong only for the two-channel case:

        sig, fs = af.read(mp3_file, offset=offset)
        assert_allclose(
            _duration(sig, sampling_rate),
            af.duration(mp3_file) - offset,
            rtol=0,              
>           atol=tolerance('duration', sampling_rate),
        )
E       AssertionError:
E       Not equal to tolerance rtol=0, atol=2.08333e-05
E
E       Mismatched elements: 1 / 1 (100%)  
E       Max absolute difference: 0.024      
E       Max relative difference: 0.02643172
E        x: array(0.884)
E        y: array(0.908)

Problem with files with incomplete header

A file provided in a project causes a problem with pyopensmile further down the line: ERROR: NULL pointer access.
@frankenjoe is familiar with this as well.
The output from soxi reveals that the file in question seems to be lacking Duration, File Size and Bit Rate:

(15:52)(~/xxx/) [py:projectenv] (master)
$ soxi data/recordings/<critical file>.wav
Input File     : 'data/recordings-all/<critical file>.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

(15:52)(~/xxx/) [py:projectenv] (master)
$ soxi data/recordings/<normal file>.wav
Input File     : 'data/recordings-all/<normal file>.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:00:40.00 = 319968 samples ~ 2999.7 CDDA sectors
File Size      : 640k
Bit Rate       : 128k
Sample Encoding: 16-bit Signed Integer PCM
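
Such files could be detected early by parsing the soxi output and checking for the missing fields. A minimal sketch, assuming the output format shown above; parse_soxi() and missing_fields() are hypothetical helpers:

```python
def parse_soxi(output):
    """Parse 'Key : value' lines of soxi output into a dictionary."""
    info = {}
    for line in output.splitlines():
        # Split at the first colon only, values may contain further colons
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def missing_fields(info, required=("Duration", "File Size", "Bit Rate")):
    """Return the required fields that are absent from the parsed output."""
    return [name for name in required if name not in info]


critical = """\
Input File     : 'data/recordings-all/<critical file>.wav'
Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM
"""
print(missing_fields(parse_soxi(critical)))
```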

Improve error message when trying to run the "duration" function and "mediainfo" is not installed

When running duration on an m4a (AAC) file, I get the following error:

af.duration(my_file)
Traceback (most recent call last):
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/core.py", line 149, in soxi
    stderr=subprocess.PIPE
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['sox', '--i', '-r', '/home/acrespi/Downloads/jellysplash.m4a']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 146, in sampling_rate
    return int(sox.file_info.sample_rate(file))
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/file_info.py", line 205, in sample_rate
    output = soxi(input_filepath, 'r')
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/sox/core.py", line 153, in soxi
    raise SoxiError("SoXI failed with exit code {}".format(cpe.returncode))
sox.core.SoxiError: SoXI failed with exit code 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 104, in duration
    return samples(file) / sampling_rate(file)
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/info.py", line 149, in sampling_rate
    return int(run(cmd))
  File "/home/acrespi/code/sensai-web-api/sensai-service/modules/playtestcloud/venv/lib/python3.7/site-packages/audiofile/core/utils.py", line 17, in run
    stderr=subprocess.STDOUT
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mediainfo': 'mediainfo'

It is not very straightforward to understand that all I need to do is install mediainfo in order to make this work (as stated in the installation instructions). A more explicit error message would be welcome. :)
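
A minimal sketch of how the missing binary could be reported more explicitly. The wrapper run_tool() is hypothetical; it only mirrors the subprocess.check_output() call visible in the traceback above:

```python
import subprocess


def run_tool(cmd):
    """Run an external command and name the missing binary if it is absent.

    Hypothetical wrapper for illustration, not the audiofile implementation.
    """
    try:
        return subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except FileNotFoundError as ex:
        raise FileNotFoundError(
            f"'{cmd[0]}' cannot be found. "
            "Please make sure it is installed and available on your PATH."
        ) from ex


try:
    run_tool(["a-binary-that-is-not-installed", "file.m4a"])
except FileNotFoundError as ex:
    print(ex)
```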

Add sampling_rate argument to convert_to_wav()

In #99 we have added bit_depth as argument to audiofile.convert_to_wav() as internally we now use audiofile.write().

As there might be the use case of converting several files with different sampling rates to the same sampling rate,
we might also think about adding sampling_rate=None as an argument to it.

Expand home directory and create directory if it does not exist

Currently, the following is not working:

signal = np.zeros((1, 16000), np.float32)
try:
    audiofile.write('/does/not/exist/test.wav', signal, 16000)
except Exception as ex:
    print(ex)
Error opening '/does/not/exist/test.wav': System error.

I usually find it convenient if I don't have to expand the filename or create the directory tree when I create a file.
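
A minimal sketch of the requested behavior using only the standard library; prepare_path() is a hypothetical helper that would be called before the file is written:

```python
import os
import tempfile


def prepare_path(path):
    """Expand '~', make the path absolute, and create missing folders."""
    path = os.path.abspath(os.path.expanduser(path))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


# Demonstrate inside a temporary folder so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "does", "not", "exist", "test.wav")
    path = prepare_path(target)
    folder_exists = os.path.isdir(os.path.dirname(path))
    print(folder_exists)
```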

Handle negative duration and offset values

As discussed in #109 (comment) we do not really have a consistent behavior when duration and/or offset are set to negative values in audiofile.read().

There are two possible solutions:

Ignore negative values

We could automatically set offset=0 and duration=0 if they are set to negative values.

Support negative values

We could interpret negative values to be counted from the end of the signal.
This would lead to the following behavior:

Suppose we have a 3 second long signal symbolized by ABC stored in file.wav. Here is what audiofile.read('file.wav', duration=duration, offset=offset) would return:

offset  duration  Returned signal
0       None      ABC
0       1         A
1       1         B
1       None      BC
-1      -1        B
-1      1         C
-2      None      BC
0       -1        C
-1      None      C
1       -1        A

So, in principle, it seems possible to support negative values. The combination of offset=0 and duration=-1 seems to be the most problematic case. One could argue that it should return an empty signal and only return C when offset=-0, but I don't like that solution.
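
The rules implied by the table can be sketched on a string standing in for the signal, with one character per second. This is only one possible reading of the proposal: segment() is a hypothetical function, and the special case for offset=0 with a negative duration is an explicit branch.

```python
def segment(signal, offset=0, duration=None):
    """Return the part of signal selected by offset and duration.

    Negative offsets count from the end. A negative duration selects the
    part that ends at the offset; for offset=0 it counts from the end.
    """
    n = len(signal)
    start = n + offset if offset < 0 else offset
    if duration is None:
        end = n
    elif duration >= 0:
        end = start + duration
    else:
        if offset == 0:
            start = n  # problematic case: treat offset=0 as the end
        end = start
        start = end + duration
    # Clip to the valid range so that slicing never wraps around
    start = max(0, min(start, n))
    end = max(0, min(end, n))
    return signal[start:end]


for offset, duration in [(0, None), (-1, -1), (0, -1), (1, -1)]:
    print(offset, duration, segment("ABC", offset, duration))
```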

No RuntimeError raised for missing file

In the documentation we have for most of the functions

[image of the documentation entry]

But when I call one on a non-existing file, I get:

>>> audiofile.("non-existent.wav")
...
LibsndfileError: Error opening '.../non-existing.wav': System error.

Benchmark partial file reading

As far as I see the benchmark only measures how long it takes to read full files. Might be interesting to also benchmark how long it takes for a library to partially read files using an offset and a duration.
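
A partial-read benchmark could time a reader with and without offset and duration. The sketch below uses the wave module from the standard library as a stand-in reader, so it does not measure audiofile or any of the benchmarked libraries; all function names are hypothetical.

```python
import os
import tempfile
import time
import wave


def write_silence(path, seconds, sampling_rate=8000):
    """Write a mono 16-bit WAV file containing silence."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sampling_rate)
        f.writeframes(b"\x00\x00" * int(seconds * sampling_rate))


def read_segment(path, offset, duration):
    """Read duration seconds starting at offset seconds as raw bytes."""
    with wave.open(path, "rb") as f:
        rate = f.getframerate()
        f.setpos(int(round(offset * rate)))
        return f.readframes(int(round(duration * rate)))


def best_time(func, *args, repeats=5):
    """Return the fastest of several wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.wav")
    write_silence(path, seconds=10)
    frames = read_segment(path, 2.0, 1.0)
    full = best_time(read_segment, path, 0.0, 10.0)
    partial = best_time(read_segment, path, 2.0, 1.0)
    print(f"full: {full:.6f} s, partial: {partial:.6f} s")
```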

Investigate if we can remove sox

As stated in #64 (comment) it might be easier if we don't depend on sox at all in audiofile.

The only thing we need to check is if reading MP3 files with ffmpeg and accessing MP3 metadata with mediainfo is as fast as with sox.

Offset and duration in samples

Currently, offset and duration are provided in seconds and internally rounded to samples. It would be good to have the option to provide those arguments directly in samples, e.g. to read a long file in small chunks of a certain length. We could introduce a unit argument as we have in audinterface.Feature.
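
The chunked reading use case could look as follows when offsets and durations are given in samples. chunk_ranges() is a hypothetical helper; how the values would be passed to audiofile.read() is left open here.

```python
def chunk_ranges(total_samples, chunk_samples):
    """Yield (offset, duration) pairs in samples that cover the whole file."""
    for offset in range(0, total_samples, chunk_samples):
        # The last chunk may be shorter than chunk_samples
        yield offset, min(chunk_samples, total_samples - offset)


for offset, duration in chunk_ranges(total_samples=10, chunk_samples=4):
    print(offset, duration)
```

Working in samples avoids rounding each chunk boundary separately, so consecutive chunks neither overlap nor leave gaps.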

Benchmark results

@hagenw nice library!

Looking at the benchmark results and the code, I wonder how audiofile is faster than soundfile on average when it wraps soundfile.read for reading?
