Comments (12)

JohnWeisz commented:

Has OfflineAudioContext rendering output been considered as an input to WebCodecs?

See https://github.com/WebAudio/web-audio-api/issues/2138

guest271314 commented:

@JohnWeisz The AudioBuffer model itself is an issue relevant to memory usage. A 3.5MB Opus file, split and merged into a single file, consumes at least 300MB of memory when streaming the channel data to AudioWorklet from the main thread. Executing disconnect(), exposing garbage collection, and calling self.gc() does not change the result. The total amount of Float32Array values is 140MB. See this comment at the Stack Overflow question "Is there a way to stop Web Audio API decodeAudioData method memory leak?":

This is a reasonable workaround. But I agree that this shouldn't be needed. The browser should be able to collect these without any help from you as a long as you drop the references to the source buffers and the audio buffers. – Raymond Toy Feb 4 '19 at 16:52

Using MediaSource drastically reduces memory usage to 50MB for the same original Opus file written to a WebM container.
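For reference, a minimal sketch of the MediaSource route; the URL and codec string are illustrative, not taken from this thread:

// Minimal MediaSource sketch: append fetched WebM/Opus bytes as they arrive.
const audio = document.querySelector("audio");
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener("sourceopen", async () => {
  const sourceBuffer = mediaSource.addSourceBuffer('audio/webm; codecs="opus"');
  const response = await fetch("./house--64kbs.webm"); // illustrative URL
  const reader = response.body.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // Wait for each append to complete before issuing the next one.
    await new Promise((resolve) => {
      sourceBuffer.addEventListener("updateend", resolve, { once: true });
      sourceBuffer.appendBuffer(value);
    });
  }
  mediaSource.endOfStream();
});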

Given the PCM processing model, am not sure how the overall memory usage can be reduced at all. What can be improved is garbage collection once the audio buffers are no longer needed.
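For rough scale, a back-of-envelope sketch of why the decoded footprint dwarfs the compressed file; the 64 kbps bitrate and 48 kHz stereo Float32 output are illustrative assumptions, not figures from this thread:

// Why decoded PCM dwarfs the compressed file (illustrative assumptions:
// 64 kbps Opus, decoded to 48 kHz stereo Float32).
const fileBytes = 3.5e6;                // ~3.5 MB Opus file
const seconds = (fileBytes * 8) / 64e3; // ≈ 437 s of audio at 64 kbps
const pcmBytes = seconds * 48000 * 2 * Float32Array.BYTES_PER_ELEMENT;
console.log(Math.round(pcmBytes / 1e6), "MB"); // ≈ 168 MB of raw samples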

If Web Audio API connects to WebCodecs via MediaStreamTrack, that could potentially reduce memory usage, as long as the pattern is a direct connection to AudioContext.destination and not further processing of Float32Arrays. However, to date we have no design-pattern concept of how WebCodecs will connect to Web Audio API to experiment with and verify input and output, thus this issue.

guest271314 commented:

Included:

opus_files.zip

guest271314 commented:

Improved result, though these problems are observed. The problem is twofold:

  • obvious gaps in playback (probably due to waiting on the ended event of the source buffer)
  • posting the data to AudioWorklet to avoid gaps requires substantial attention to the 128-sample-per-process()-call constraint; when the test is not correctly configured, it can crash the browser tab and the underlying OS.

The approach uses AudioWorklet with subarray() to create 128-element Float32Arrays (without using WASM; compare https://github.com/GoogleChromeLabs/web-audio-samples/tree/master/audio-worklet/design-pattern/wasm-ring-buffer).

Posting the code here now, before crashing the browser and OS again and potentially losing the tentatively working version:

// Assumes: ac is an AudioContext, msd a MediaStreamAudioDestinationNode,
// track the MediaStreamTrack from msd.stream, uint8array the fetched file bytes.
await ac.audioWorklet.addModule("audioWorklet.js");
const aw = new AudioWorkletNode(ac, "audio-data-worklet-stream", {
  numberOfInputs: 2,
  numberOfOutputs: 2,
  processorOptions: {
    buffers: { channel0: [], channel1: [] }
  }
});
aw.port.onmessage = async e => {
  console.log(e.data);
  if (e.data === "ended") {
    track.stop();
    track.enabled = false;
    await ac.suspend();
  }
};
aw.connect(msd);
// decode the entire file, then transfer the channel data to the worklet
const ab = await ac.decodeAudioData(uint8array.buffer);
const [channel0, channel1] = [ab.getChannelData(0), ab.getChannelData(1)];
aw.port.postMessage({ channel0, channel1 }, [channel0.buffer, channel1.buffer]);

The processor (audioWorklet.js):
class AudioDataWorkletStream extends AudioWorkletProcessor {
  constructor(options) {
    super();
    if (options.processorOptions) {
      Object.assign(this, options.processorOptions);
    }
    this.i = 0;
    this.resolve = void 0;
    this.promise = new Promise(resolve => this.resolve = resolve)
                   .then(_ => this.port.postMessage("ended"));
    this.port.onmessage = e => {
      const { channel0, channel1 } = e.data;
      ++this.i;
      // split the transferred channel data into 128-sample views,
      // one per process() call
      for (let i = 0; i < channel0.length; i += 128) {
        this.buffers.channel0.push(channel0.subarray(i, i + 128));
      }
      for (let i = 0; i < channel1.length; i += 128) {
        this.buffers.channel1.push(channel1.subarray(i, i + 128));
      }
    };
  }
  process(inputs, outputs) {
    // all queued buffers consumed: signal "ended" and stop processing
    if (this.i > 0 && this.buffers.channel0.length === 0) {
      this.resolve();
      globalThis.console.log(currentTime, currentFrame, this.buffers);
      return false;
    }
    for (let channel = 0; channel < outputs.length; ++channel) {
      const [outputChannel] = outputs[channel];
      let [inputChannel] = inputs[channel];
      if (this.i > 0 && this.buffers.channel0.length) {
        inputChannel = this.buffers[`channel${channel}`].shift();
      }
      // output silence until data arrives; nothing is connected to the inputs
      if (!inputChannel) continue;
      outputChannel.set(inputChannel);
    }
    return true;
  }
}
registerProcessor("audio-data-worklet-stream", AudioDataWorkletStream);

However, the "glitches" or "gaps" between the merged files are still perceptible in some instances.

It should be possible to set AudioWorklet data from an input media resource, for example a ReadableStream, without having to split and merge files or use MediaSource or HTMLMediaElement: to play back media without first reading the entire file, that is, to play the media while reading the input, in parallel.
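A minimal sketch of that pattern, where workletNode is an already-constructed AudioWorkletNode and chunkToFloat32 is a hypothetical stand-in for whatever parsing the container and codec require:

// Read the response incrementally and transfer parsed samples to the worklet.
const response = await fetch("./audio.wav"); // illustrative URL
const reader = response.body.getReader();
for (;;) {
  const { value, done } = await reader.read();
  if (done) break;
  const channel0 = chunkToFloat32(value); // hypothetical parser
  // Transfer (not copy) the underlying ArrayBuffer to the worklet.
  workletNode.port.postMessage({ channel0 }, [channel0.buffer]);
}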

How will using WebCodecs with Web Audio API improve playback quality when the requirement is to feed disparate and arbitrary media sources into a single input, or multiple inputs, to Web Audio API?

How is WebCodecs concretely conceptualized as being capable of resolving the issue: processing partial media file input, arbitrary selection of parts of media, or a potentially infinite input stream via WebCodecs; and where is the connection made to Web Audio API?

JohnWeisz commented:

@guest271314

I totally agree AudioBuffer is not really an efficient model for audio data consumption, although I don't see a viable yet generally usable alternative either. (That said, I'm working on a proof-of-concept streaming-based AudioBuffer alternative in my spare time, and the streaming of obvious formats is well supported, so there might be something in the near future I can help with, hopefully.)

guest271314 commented:

Using MediaStream and MediaStreamTrack is a viable alternative to AudioBuffer usage, with observably lower memory usage. Exposing the media decoders and encoders that HTMLMediaElement uses (in Window and Worker scope), without using an <audio> element, could substitute for MediaElementAudioSourceNode and avoid reliance on the DOM. The only reason to use Web Audio API in such a case would be piping the stream through to output an audio effect.

Since, per the Explainer, WebCodecs is focused on MediaStreamTracks,

let decoded = await new AudioDecoder({codec: 'opus'});
let mediaStream = new MediaStream([decoded]); 
let webAudioConnection = new MediaStreamAudioSourceNode(ac, {mediaStream})

could be used to connect to Web Audio API, if necessary.
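For comparison, a hedged sketch of how the AudioDecoder interface in the current WebCodecs specification hands decoded PCM back to script; the Opus packets (opusPacket below) would come from a demuxer, which is assumed and out of scope here:

// Sketch using the AudioDecoder interface as later specified in WebCodecs.
const decoder = new AudioDecoder({
  output: (audioData) => {
    // Copy one plane of decoded PCM out as Float32 samples.
    const pcm = new Float32Array(audioData.numberOfFrames);
    audioData.copyTo(pcm, { planeIndex: 0, format: "f32-planar" });
    audioData.close();
    // pcm could now be transferred to an AudioWorklet, as in the examples above.
  },
  error: (e) => console.error(e),
});
decoder.configure({ codec: "opus", sampleRate: 48000, numberOfChannels: 2 });
// For each demuxed packet:
decoder.decode(new EncodedAudioChunk({
  type: "key", timestamp: 0, data: opusPacket // opusPacket: bytes from a demuxer
}));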

Am trying to gather how the specification authors who have suggested that WebCodecs will somehow solve issues posted at Web Audio API repositories are actually conceptualizing a prospective design pattern. As yet, have not located any documentation providing evidence of any such concept, flow chart, algorithm, or API connection architecture.

guest271314 commented:

A proof-of-concept for a tentative WebCodecs-type connection to Web Audio API.

To preface, have tested the code hundreds of times, locally serving a ~35MB WAV file (22.05 kHz, 2 channels) converted from Opus using

$ opusdec --rate 22050 --force-wav house--64kbs.opus output.wav

which can be downloaded from https://drive.google.com/file/d/19-28SYFjhHg_a5NqPG1GIqV_sy5iMQ5x/view. Have no way to test the code at other architectures or devices right now.

The goal of this code is to demonstrate how a WebCodecs-like API would connect to Web Audio API.

Specifically, the primary goal is to request an audio file having any codec, then parse (demux) and commence playback of that media resource while the request is still being read, before the entire file has been downloaded, with "seamless" audio playback (still some work to do there, though it "works" as a base example of the pattern).

The bulk of the work in this example is done in a Worker.

Kindly test the code at various devices, architectures to verify the output.

index.html

<!DOCTYPE html>
<html>
<head>
  <title>Stream audio from Worker to AudioWorklet</title>
</head>
<body>
  <audio autoplay controls></audio>
  <script>
    // AudioWorkletStream
    // Stream audio from Worker to AudioWorklet
    // guest271314 2-24-2020
    "use strict";
    if (globalThis.gc) gc(); // gc() is only defined when the browser exposes it
    const handleEvents = e => globalThis.console.log(e.type === "statechange" ? e.target.state : e.type);
    // TODO "seamless" playback; media fragment concatenation, processing; seekable streams
    const audio = document.querySelector("audio");
    // audio.ontimeupdate = e => console.log(audio.currentTime, ac.currentTime);
    const events = ["durationchange", "ended", "loadedmetadata"
    , "pause", "play", "playing", "suspend", "waiting"];
    for (let event of events) audio.addEventListener(event, handleEvents);

    class AudioWorkletStream {
      // options here can be exhaustive, e.g., providing means to connect multiple SharedWorkers
      // file handles, Blobs, ArrayBuffers, ReadableStream, WritableStream, HTMLMediaElement
      // for multiple audio, video, images, text tracks, et al., media stream input, output processing
      constructor({
        codecs = ["audio/wav", "wav22k2ch"]
        , urls = ["http://localhost:8000?wav=true"]
        , sampleRate = 44100
        , numberOfChannels = 2
        , latencyHint = "playback"
        , workletOptions = {}
      } = {}) {
        this.mediaStreamTrack = new Promise(async resolve => {
          if (globalThis.gc) gc();
          const ac = new AudioContext({
            sampleRate,
            numberOfChannels,
            latencyHint
          });
          await ac.suspend();
          ac.onstatechange = handleEvents;
          await ac.audioWorklet.addModule("audioWorklet.js");
          const aw = new AudioWorkletNode(ac, "audio-data-worklet-stream", workletOptions);
          const msd = new MediaStreamAudioDestinationNode(ac);
          const {
            stream
          } = msd;
          // MediaStreamTrack, kind "audio"
          const [track] = stream.getAudioTracks();
          // fulfill Promise with MediaStreamTrack
          resolve(track);
          // set enabled to false
          track.enabled = false;
          aw.connect(msd);
          // inactive event at Chromium 81,
          // no longer defined in the Media Capture and Streams specification;
          // Firefox does not yet fully support AudioWorklet
          stream.oninactive = stream.onactive = handleEvents;
          // not dispatched at Chromium 81
          track.onmute = track.onunmute = track.onended = handleEvents;
          // transfer sources, here, e.g., file handle
          // ReadableStream, WritableStream (https://github.com/whatwg/streams/blob/master/transferable-streams-explainer.md)
          // etc.
          const worker = new Worker("worker.js", {
            type: "module"
          });
          worker.postMessage({
            port: aw.port,
            codecs,
            urls
          }, [aw.port]);
          worker.onmessage = async e => {
            // use suspend(), resume() to synchronize to degree possible
            // with currentTime of <audio> with MediaStream set as srcObject
            if (e.data.start) {
              track.enabled = true;
              await ac.resume();
            }
            if (e.data.ended) {
              track.stop();
              track.enabled = false;
              await ac.suspend();
              msd.disconnect();
              aw.disconnect();
              aw.port.close();
              worker.terminate();
              await ac.close();
              for (let event of events) audio.removeEventListener(event, handleEvents);
              if (globalThis.gc) gc();
              // currentTime(s)    
              console.log({
                audio: audio.currentTime
              , currentTime: e.data.currentTime
              , currentFrame: e.data.currentFrame
              , minutes: Math.floor(e.data.currentTime / 60)
              , seconds: ((e.data.currentTime / 60) - Math.floor(e.data.currentTime / 60)) * 60
              });
            };
          };
        });
      };
    };

    (async() => {
      // set parameters as arrays for potential "infinite" input, output stream
      let workletStream = new AudioWorkletStream({
        // multiple codecs
        codecs: ["audio/wav", "wav22k2ch"]
          // multiple URL's
        , urls: ["http://localhost:8000?wav=true"]
        , sampleRate: 22050
        , numberOfChannels: 2
        , latencyHint: "playback"
        , workletOptions: {
            numberOfInputs: 2,
            numberOfOutputs: 2,
            channelCount: 2,
            processorOptions: {
              buffers: {
                channel0: [],
                channel1: []
              },
              i: 0,
              promise: void 0,
              resolve: void 0
            }
          }
      });
      let {mediaStreamTrack} = workletStream;
      let mediaStream = new MediaStream();
       // Chromium bug, https://bugs.chromium.org/p/chromium/issues/detail?id=1045832 
       // currentTime does not progress at HTMLMediaElement when addTrack() is called on a MediaStream 
       // set as srcObject on <audio>, <video> where no MediaStreamTrack (getAudioTracks() // []) is previously set
      mediaStream.addTrack(await mediaStreamTrack);
      audio.srcObject = mediaStream;
    })();
  </script>
</body>
</html>

worker.js

// AudioWorkletStream 
// Stream audio from Worker to AudioWorklet (POC)
// guest271314 2-24-2020
import {CODECS} from "./codecs.js";
if (globalThis.gc) gc();
const delay = async ms => await new Promise(resolve => setTimeout(resolve, ms));
let port;
onmessage = async e => {
  "use strict";
  if (!port) {
    ([port] = e.ports);
    port.onmessage = event => postMessage(event.data);
  }
  let init = false;
  const {
    codecs: [mime, codec],
    urls: [url]
  } = e.data;
  const {
    default: processStream
  } = await
  import (CODECS.get(mime).get(codec));
  let next = [];
  // costs latency, not necessary
  let writes = 0;
  let bytesWritten = 0;
  let samplesLength = 0;
  let portTransfers = 0;
  // https://fetch-stream-audio.anthum.com/2mbps/house-41000hz-trim.wav
  const response = await fetch(url);
  const readable = response.body;
  const writable = new WritableStream({
    async write(value) {
      bytesWritten += value.length;
      // value (Uint8Array) length is not guaranteed to be a multiple of 2
      // (the Uint16Array element size); carry any odd trailing byte over
      // to the next write via the next array
      if (next.length) {
        value = new Uint8Array([...next.splice(0, next.length), ...value]);
      }
      if (value.length % 2 !== 0) {
        next.push(value[value.length - 1]);
        value = value.slice(0, value.length - 1);
      }
      // The length is in bytes, but the array is 16 bits, so divide by 2.
      // let length;
      // length = (data[20] + data[21] * 0x10000) / 2; 
      // we do not need length here, we process input until no more, or infinity
      let data = new Uint16Array(value.buffer);
      if (!init) {
        init = true;
        data = data.subarray(22); // skip the 44-byte RIFF/WAVE header (22 Uint16 values)
      }
      const {
        ch0, ch1
      } = processStream(data);
      do {
        const channel0 = new Float32Array(ch0.splice(0, 128));
        const channel1 = new Float32Array(ch1.splice(0, 128));
        samplesLength += channel0.length + channel1.length;
        port.postMessage({
          channel0, channel1
        }, [channel0.buffer, channel1.buffer]);
        ++portTransfers;
      } while (ch0.length);
      ++writes;
      // wait N ms to avoid choppy output during initial 30 seconds of playback
      // affects total value of writes
      await delay(350);
    }, close() {
      // with await delay(ms) {writes: 340, bytesWritten: 35491032, samplesWritten: 69321}
      // without await delay(ms) {writes: 72, bytesWritten: 35491032, samplesWritten: 69321}
      globalThis.console.log({
        writes // variable
        , bytesWritten // 35491032
        , samplesLength // variable (69320 +/-1)
        , portTransfers // variable
      }); 
      if (globalThis.gc) gc();
    }
  }, new CountQueuingStrategy({
    highWaterMark: 128
  }));
  await readable.pipeTo(writable, {
    preventCancel: true
  });

  globalThis.console.log("read/write done"); 
}

codecs.js

export const CODECS = new Map([["audio/wav", new Map([["wav22k2ch","./wav22k2ch.js"]])]]);
// $ opusdec --rate 22050 --force-wav house--64kbs.opus output.wav
// Decoding to 22050 Hz (2 channels)
// Encoded with libopus 1.1
// ENCODER=opusenc from opus-tools 0.1.9
// ENCODER_OPTIONS=--bitrate 64
// Decoding complete. 
//
// $ mediainfo output.wav
// General
// Complete name        : output.wav
// Format     : Wave
// File size  : 33.8 MiB
// Duration   : 6 min 42 s
// Overall bit rate mode: Constant
// Overall bit rate     : 706 kb/s
//
// Audio
// Format     : PCM
// Format settings      : Little / Signed
// Codec ID   : 1
// Duration   : 6 min 42 s
// Bit rate mode        : Constant
// Bit rate   : 705.6 kb/s
// Channel(s) : 2 channels
// Sampling rate        : 22.05 kHz
// Bit depth  : 16 bits
// Stream size: 33.8 MiB (100%)

wav22k2ch.js

// https://stackoverflow.com/a/35248852
// inputArray is a Uint16Array of interleaved 16-bit PCM samples (L, R, L, R, ...)
export default function int16ToFloat32(inputArray) {
  let ch0 = [];
  let ch1 = [];
  for (let i = 0; i < inputArray.length; i++) {
    const int = inputArray[i];
    // If the high bit is on, it is a negative number and counts backwards.
    const float = (int >= 0x8000) ? -(0x10000 - int) / 0x8000 : int / 0x7FFF;
    // de-interleave: even indices to channel 0, odd indices to channel 1
    if (i % 2 === 0) {
      ch0.push(float);
    } else {
      ch1.push(float);
    }
  }
  return { ch0, ch1 };
}
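A quick sanity check of the conversion, using the extreme and zero sample values:

// 0x7FFF is the maximum positive sample, 0x8000 the minimum negative
const { ch0, ch1 } = int16ToFloat32(new Uint16Array([0x7FFF, 0x8000, 0, 0]));
console.log(ch0); // [1, 0]
console.log(ch1); // [-1, 0]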

audioWorklet.js

class AudioDataWorkletStream extends AudioWorkletProcessor {
  constructor(options) {
    super(options);
    if (globalThis.gc) {
      gc();
    }
    if (options.processorOptions) {
      Object.assign(this, options.processorOptions);
    }; 
    this.port.onmessage = this.appendBuffers.bind(this);
  }
  appendBuffers({
    data: {
      channel0, channel1
    }
  }) {
    this.buffers.channel0.push(channel0);
    this.buffers.channel1.push(channel1);
    ++this.i;
    if (this.i === 1) {
      this.port.postMessage({"start":true});
      globalThis.console.log({
        currentTime, currentFrame, buffers: this.buffers
      });
    };
  }
  endOfStream() {
    this.port
    .postMessage({
      "ended": true,
      currentTime,
      currentFrame
    });
    globalThis.console.log({
      currentTime, currentFrame, sampleRate, buffers: this.buffers
    });
    if (globalThis.gc) gc();
  }
  process(inputs, outputs) {
    // all queued buffers consumed: stop processing
    if (this.i > 0 && this.buffers.channel0.length === 0 && this.buffers.channel1.length === 0) {
      return false;
    }
    for (let channel = 0; channel < outputs.length; ++channel) {
      const [outputChannel] = outputs[channel];
      let inputChannel;
      if (this.i > 0 && this.buffers.channel0.length > 0) {
        inputChannel = this.buffers[`channel${channel}`].shift();
      } else if (this.i > 0 && this.buffers.channel1.length > 0) {
        // handle channel0.length === 0, channel1.length > 0
        inputChannel = this.buffers.channel1.shift();
        // end of stream
        this.endOfStream();
      }
      // output silence until the first buffers arrive
      if (!inputChannel) continue;
      outputChannel.set(inputChannel);
    }
    return true;
  }
}
registerProcessor("audio-data-worklet-stream", AudioDataWorkletStream);

guest271314 commented:

Theoretically, there should not be any impact on memory, as no content is stored: the pattern in this case is initiated by a fetch() network request, flows from the response body ReadableStream to a WritableStream to AudioWorklet, and outputs to a MediaStreamTrack, with cache disabled at the browser. Particularly after the media has completed playback (unless the stream is infinite), there is no reason for any of the APIs used to retain the used data in memory for any purpose.

The same procedure should be possible for input at a 44100 sample rate, or any other codec. Apply the parsing algorithm in the main thread, a Worker, SharedWorker, module script, etc., then transfer Float32Arrays to AudioWorklet, which avoids decodeAudioData(), AudioBuffer, and AudioBufferSourceNode altogether.

Was able to parse hexadecimal-encoded PCM within a Matroska file output by MediaRecorder at Chromium, using mkvparse.

Next will attempt to parse Opus, both in WebM and Ogg containers, without using WASM, using only the APIs shipped with the browser.

guest271314 commented:

Demonstration with a 277MB WAV file: https://plnkr.co/edit/nECtUZ?p=preview.

padenot commented:

Virtual F2F:

  • Web Codec is low level and flexible and can encode PCM data.
  • Getting PCM in or out of an AudioContext is done via AudioWorkletNode, and a ring buffer using SharedArrayBuffer or postMessage. This is then fed to an AudioEncoder (a minimal ring-buffer sketch follows).
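A minimal sketch of such a ring buffer, assuming a single producer (a Worker posting decoded samples) and a single consumer (the AudioWorkletProcessor); the capacity and index layout are illustrative, and SharedArrayBuffer requires cross-origin isolation:

// Single-producer/single-consumer float ring buffer over SharedArrayBuffer.
const FRAMES = 4096;
const sab = new SharedArrayBuffer(8 + FRAMES * 4); // 2 Int32 indices + samples
const indices = new Int32Array(sab, 0, 2);         // [0] = read, [1] = write
const samples = new Float32Array(sab, 8, FRAMES);

// Producer side (e.g., a Worker decoding PCM); returns samples written.
function push(src) {
  const r = Atomics.load(indices, 0);
  const w = Atomics.load(indices, 1);
  const free = (r - w - 1 + FRAMES) % FRAMES; // one slot kept empty
  const n = Math.min(free, src.length);
  for (let i = 0; i < n; i++) samples[(w + i) % FRAMES] = src[i];
  Atomics.store(indices, 1, (w + n) % FRAMES);
  return n;
}

// Consumer side (e.g., inside AudioWorkletProcessor.process()); returns samples read.
function pull(dst) {
  const r = Atomics.load(indices, 0);
  const w = Atomics.load(indices, 1);
  const avail = (w - r + FRAMES) % FRAMES;
  const n = Math.min(avail, dst.length);
  for (let i = 0; i < n; i++) dst[i] = samples[(r + i) % FRAMES];
  Atomics.store(indices, 0, (r + n) % FRAMES);
  return n;
}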

guest271314 commented:

Web Codec is low level and flexible and can encode PCM data.

AFAICT Web Codec is not specified or implemented. The closed issues reference WebCodecs somehow being capable of connecting to Web Audio API; however, no evidence exists that this is or will be the case. Until then, this issue should remain open.

guest271314 commented:

@padenot

  • Getting PCM in or out of an AudioContext is done via AudioWorkletNode, and a ring buffer using SharedArrayBuffer or postMessage. This is then fed to a AudioEncoder.

The postMessage() option is not viable in practice. Perhaps that works in theory, but not in production code.

As already demonstrated in several issues and code examples, using postMessage() for a substantial amount of data input will inevitably result in gaps in playback at Firefox and at Chrome/Chromium browsers.
