
Comments (7)

guest271314 commented on July 22, 2024

The audio playback quality is sub-par when resampling from 48000 to 22050. What is the suggested procedure to produce quality audio without glitches, gaps, or frames at faster or slower rates when converting from WebCodecs AudioData to AudioBuffer, for the purpose of breaking out of the hard-coded box of the Chrome WebCodecs implementation?

webcodecs-serialize-to-json-deserialize-json.zip

from web-audio-api-v2.

guest271314 commented on July 22, 2024

I used OfflineAudioContext to resample the hard-coded 48000 sample rate and numberOfFrames (2568 for the first and 2880 for the remainder) of the AudioData objects output by AudioDecoder.

https://chromium.googlesource.com/chromium/src/+/49cf62132c057a79b093c8b5ab72f195cac447cc/media/audio/audio_opus_encoder.cc#32

// For Opus, we try to encode 60ms, the maximum Opus buffer, for quality
// reasons.
constexpr int kOpusPreferredBufferDurationMs = 60;

https://chromium.googlesource.com/chromium/src/+/49cf62132c057a79b093c8b5ab72f195cac447cc/media/audio/audio_opus_encoder.cc#58

// default preferred 48 kHz. If the input sample rate is anything else, we'll
// use 48 kHz.
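
Taken together, those two constants explain the observed AudioData sizes: with a hard-coded 48 kHz rate and a 60 ms preferred buffer, each output carries 2880 frames regardless of the configured sampleRate.

```javascript
// Derive the observed per-AudioData frame count from Chromium's constants.
const kOpusPreferredBufferDurationMs = 60; // from audio_opus_encoder.cc
const kHardcodedSampleRate = 48000;        // the Opus path always uses 48 kHz
const framesPerOutput =
  (kHardcodedSampleRate * kOpusPreferredBufferDurationMs) / 1000;
console.log(framesPerOutput); // 2880
```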

something like

  const TARGET_FRAME_SIZE = 220;
  const TARGET_SAMPLE_RATE = 22050;
  // ...
  const config = {
    numberOfChannels: 1,
    sampleRate: 22050, // Chrome hardcodes to 48000
    codec: 'opus',
    bitrate: 16000,
  };
  encoder.configure(config);
  const decoder = new AudioDecoder({
    error(e) {
      console.error(e);
    },
    async output(frame) {
      // chunk_length, len, channelData, decoderController and decoderResolve
      // are defined in the surrounding (elided) code.
      ++chunk_length;
      const { numberOfChannels, numberOfFrames, sampleRate } = frame;
      const size = frame.allocationSize({ planeIndex: 0 });
      const data = new ArrayBuffer(size);
      frame.copyTo(data, { planeIndex: 0 });
      frame.close();
      const buffer = new AudioBuffer({
        length: numberOfFrames,
        numberOfChannels,
        sampleRate,
      });
      buffer.getChannelData(0).set(new Float32Array(data));
      // https://stackoverflow.com/a/27601521
      const oac = new OfflineAudioContext(
        buffer.numberOfChannels,
        Math.ceil(buffer.duration * TARGET_SAMPLE_RATE),
        TARGET_SAMPLE_RATE
      );
      // Play it from the beginning.
      const source = new AudioBufferSourceNode(oac, {
        buffer,
      });
      source.connect(oac.destination);
      source.start();
      const ab = (await oac.startRendering()).getChannelData(0);
      for (let i = 0; i < ab.length; i++) {
        if (channelData.length === TARGET_FRAME_SIZE) {
          const floats = new Float32Array(
            channelData.splice(0, TARGET_FRAME_SIZE)
          );
          decoderController.enqueue(floats);
        }
        channelData.push(ab[i]);
      }
      if (chunk_length === len) {
        if (channelData.length) {
          const floats = new Float32Array(TARGET_FRAME_SIZE);
          floats.set(channelData.splice(0, channelData.length));
          decoderController.enqueue(floats);
        }
        decoderController.close();
        decoderResolve();
      }
    },
  });


padenot commented on July 22, 2024

The current design direction is to be able to create AudioBuffer objects directly from typed arrays, and to allow AudioBuffer to internally use more data types than f32. For now, authors need to create an AudioBuffer of the same size, use AudioData.copyTo to copy to an intermediate ArrayBuffer, and then copy (with possible conversion) to the AudioBuffer. This is wasteful and not ergonomic.
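
The double copy described can be sketched with plain buffers standing in for the browser objects, so the data flow is visible (the values here are arbitrary illustrations):

```javascript
// Sketch of the interim workflow: AudioData.copyTo() fills an intermediate
// ArrayBuffer, then the samples are copied (and possibly converted) again
// into the AudioBuffer channel. Plain buffers stand in for both objects here.
const numberOfFrames = 4;
// Stand-in for the destination of AudioData.copyTo({ planeIndex: 0 }).
const intermediate = new ArrayBuffer(
  numberOfFrames * Float32Array.BYTES_PER_ELEMENT
);
new Float32Array(intermediate).set([0.1, -0.2, 0.3, -0.4]);
// Stand-in for AudioBuffer.getChannelData(0): the second, wasteful copy.
const channel = new Float32Array(numberOfFrames);
channel.set(new Float32Array(intermediate));
```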

Another design direction is to be able to get the memory of an AudioData, and directly construct an AudioBuffer from this memory, skipping all copies (w3c/webcodecs#287).


guest271314 commented on July 22, 2024

There are several issues.

  • AudioData at AudioDecoder.output is absolutely dissimilar from the input AudioData at AudioEncoder.encode - developers use codecs for compression, not for implementation restrictions. Might as well just use opusenc directly, or in WASM form, if we cannot do the equivalent of opusenc --raw-rate 22050 input.wav output.opus in the AudioEncoder configuration - the options are ignored by the implementation. I can use Native Messaging and fetch() reliably, and WebTransport far less reliably, to stream input from the browser and get STDOUT from a native application.
  • Further resampling and converting to data of lesser length is needed to accommodate MediaStreamTrackGenerator, where the AudioData output from an OscillatorNode connected to a MediaStreamAudioDestinationNode and processed with MediaStreamTrackProcessor is also dissimilar to the WebCodecs AudioDecoder.decode() output at the output callback - with the same input.
  • AudioWorklet results in fewer glitches than MediaStreamTrackGenerator when the input is processed in a Worker and WebAssembly.Memory.grow() is used, because when multiple ReadableStreams are processed in parallel and piped through and to other streams on the same thread, one can take priority and cause glitches in initial playback until the input is completely read. However, the only way to get an AudioWorklet instance is via an ECMAScript module, which limits usage due to CSP, and AudioWorklet does not expose fetch() or WebTransport - thus, use a single memory with the ability to grow; WebAssembly collects garbage, and calling MediaStreamTrackGenerator stop() can crash the tab.
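
The single growable memory mentioned above can be sketched as follows; the append helper and offsets are illustrative, not part of any existing code:

```javascript
// Sketch: one WebAssembly.Memory grown as input arrives, instead of
// allocating a new buffer per chunk (WebAssembly pages are 64 KiB each).
const memory = new WebAssembly.Memory({ initial: 1, maximum: 256 });
let writeOffset = 0; // bytes written so far
function append(frames /* Float32Array */) {
  const needed = writeOffset + frames.byteLength;
  if (needed > memory.buffer.byteLength) {
    const pages = Math.ceil((needed - memory.buffer.byteLength) / 65536);
    memory.grow(pages); // note: growing detaches existing views on memory.buffer
  }
  new Float32Array(memory.buffer, writeOffset, frames.length).set(frames);
  writeOffset = needed;
}
append(new Float32Array(20000).fill(0.5)); // 80000 bytes: forces one grow
```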

In summary, there needs to be consistency between these burgeoning APIs so that user-defined conversion is not necessary, or, if the user decides to convert between AudioData and AudioBuffer, so that it can be done "seamlessly". WebCodecs has free rein to do whatever it wants - why would the decoder output only a 48000 sample rate when I deliberately configured a 22050 sample rate and 1 channel? That invites user-defined conversion (and its issues).


guest271314 commented on July 22, 2024

I updated and tested the code using OfflineAudioContext a few hundred more times and compared it to creating a WAV file using data from AudioData.copyTo():

// https://github.com/higuma/wav-audio-encoder-js
class WavAudioEncoder {
  constructor({ buffers, sampleRate, numberOfChannels }) {
    Object.assign(this, {
      buffers,
      sampleRate,
      numberOfChannels,
      numberOfSamples: 0,
      dataViews: [],
    });
  }
  setString(view, offset, str) {
    const len = str.length;
    for (let i = 0; i < len; i++) {
      view.setUint8(offset + i, str.charCodeAt(i));
    }
  }
  async encode() {
    const [{ length }] = this.buffers;
    const data = new DataView(
      new ArrayBuffer(length * this.numberOfChannels * 2)
    );
    let offset = 0;
    for (let i = 0; i < length; i++) {
      for (let ch = 0; ch < this.numberOfChannels; ch++) {
        let x = this.buffers[ch][i] * 0x7fff;
        data.setInt16(
          offset,
          x < 0 ? Math.max(x, -0x8000) : Math.min(x, 0x7fff),
          true
        );
        offset += 2;
      }
    }
    this.dataViews.push(data);
    this.numberOfSamples += length;
    const dataSize = this.numberOfChannels * this.numberOfSamples * 2;
    const view = new DataView(new ArrayBuffer(44));
    this.setString(view, 0, 'RIFF');
    view.setUint32(4, 36 + dataSize, true);
    this.setString(view, 8, 'WAVE');
    this.setString(view, 12, 'fmt ');
    view.setUint32(16, 16, true);
    view.setUint16(20, 1, true);
    view.setUint16(22, this.numberOfChannels, true);
    view.setUint32(24, this.sampleRate, true);
    view.setUint32(28, this.sampleRate * this.numberOfChannels * 2, true); // byte rate
    view.setUint16(32, this.numberOfChannels * 2, true);
    view.setUint16(34, 16, true);
    this.setString(view, 36, 'data');
    view.setUint32(40, dataSize, true);
    this.dataViews.unshift(view);
    return new Blob(this.dataViews, { type: 'audio/wav' }).arrayBuffer();
  }
}
// ...
const wav = new WavAudioEncoder({
  sampleRate: 48000,
  numberOfChannels: 1,
  buffers: [new Float32Array(data)],
});
const ab = (await ac.decodeAudioData(await wav.encode())).getChannelData(0);

Glitches can occasionally occur at the beginning of the OfflineAudioContext playback. No glitches occur when creating WAV headers and prepending them to the data. Test and compare the differences for yourself: https://guest271314.github.io/webcodecs/.
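
For reference, the 16-bit PCM header fields written above are derived from the format parameters; note the byte rate is sampleRate * numberOfChannels * 2 for 16-bit samples (a fixed sampleRate * 4 is only correct for stereo):

```javascript
// Derivation of the WAV "fmt " chunk fields for 16-bit PCM, mono, 48 kHz.
const sampleRate = 48000;
const numberOfChannels = 1;
const bitsPerSample = 16;
const blockAlign = numberOfChannels * (bitsPerSample / 8); // bytes per frame
const byteRate = sampleRate * blockAlign;                  // bytes per second
console.log(blockAlign, byteRate); // 2 96000
```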

Are these the simplest approaches to resample the output from AudioDecoder.decode()?

The important point is that it is only necessary to resample the data from AudioData at AudioDecoder.output because WebCodecs does not honor the AudioEncoder or AudioDecoder configuration and resamples to 48000, and outputs a numberOfFrames far greater than the input numberOfFrames, which is inconsistent behaviour.
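
The frame-count mismatch is exactly the rate ratio: a 60 ms AudioData at the hard-coded 48000 Hz carries 2880 frames, where the configured 22050 Hz would carry 1323.

```javascript
// Relate the observed output frame count to what the configured rate implies.
const outputRate = 48000;     // what Chrome's Opus path actually emits
const configuredRate = 22050; // what was passed to configure()
const outputFrames = 2880;    // observed per-AudioData frame count
const durationMs = (outputFrames * 1000) / outputRate;               // 60 ms
const framesAtConfiguredRate = (configuredRate * durationMs) / 1000; // 1323
```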

If there were consistency between WebCodecs AudioEncoder.output and AudioDecoder.output with regard to AudioData, there would be no need to resample with the Web Audio API.


padenot commented on July 22, 2024

Two things:

  • This approach to resample segments of audio with an OfflineAudioContext cannot work. Non-naive audio resampling is a stateful operation, and creating a new OfflineAudioContext each time doesn't allow keeping any state. Resampling using an OfflineAudioContext only works if the entirety of the audio is resampled in one operation.

  • Resampling to another rate is not in the scope of Web Codecs. Web Codecs is just about decoding and encoding, and resampling the audio to play it out is expected for now, since there is no resampler object in the Web Platform yet. Opus always works at 48 kHz internally, and by default always decodes to 48 kHz, so this is what you see in Web Codecs. For other codecs, you'll see that the rate is (usually) the rate of the input stream.
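
The statefulness point can be illustrated without any audio API: here a 3-tap FIR filter (a toy stand-in for the resampler's interpolation filter) is applied to two chunks, once with its history carried across the boundary and once restarted per chunk, which is effectively what a fresh OfflineAudioContext per AudioData does.

```javascript
// Toy FIR showing why per-chunk processing of a stateful DSP operation
// glitches: restarting the filter re-runs its warm-up at every boundary.
function firStream(taps) {
  let history = new Array(taps.length - 1).fill(0); // carried filter state
  return (chunk) => {
    const x = [...history, ...chunk];
    const out = [];
    for (let n = taps.length - 1; n < x.length; n++) {
      let acc = 0;
      for (let k = 0; k < taps.length; k++) acc += taps[k] * x[n - k];
      out.push(acc);
    }
    history = x.slice(x.length - (taps.length - 1)); // keep the last samples
    return out;
  };
}
const taps = [0.25, 0.5, 0.25];
const stream = firStream(taps); // one filter, state carried across chunks
const stateful = [...stream([4, 4, 4]), ...stream([4, 4, 4])];
// A fresh filter per chunk: the warm-up dip repeats at the chunk boundary.
const stateless = [...firStream(taps)([4, 4, 4]), ...firStream(taps)([4, 4, 4])];
console.log(stateful);  // [1, 3, 4, 4, 4, 4]
console.log(stateless); // [1, 3, 4, 1, 3, 4]
```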


guest271314 commented on July 22, 2024

The problem is that resampling is necessary because of the WebCodecs output.

All you need do is test the output of AudioDecoder and try to pass that AudioData directly to a MediaStreamTrackGenerator. One of two outcomes currently exists without user-defined intervention.

I can do $ opusenc --raw-rate 22050 input.wav output.opus and get the output I set. WebCodecs ignores the configuration, yet claims "flexibility". Since you are citing 48 kHz as the inflexible default for the WebCodecs implementation of 'opus', you need to update your specification to state that unambiguously, so that I will no longer expect the options I pass to be effectual.

Resampling is necessary to take the output of WebCodecs AudioDecoder AudioData to other APIs - without using setTimeout() and essentially guessing when the incompatible AudioData stream will end.

I suggest you folks actually test AudioDecoder => MediaStreamTrackGenerator, and stop claiming WebCodecs is "flexible" if you intend to restrict the options available in opusenc and opusdec. I might as well just use opusenc and opusdec with fetch() or WebTransport.

