fluester – [ˈflʏstɐ]

Node.js bindings for OpenAI's Whisper. Hard-fork of whisper-node.

Features

  • Output transcripts to JSON (also .txt .srt .vtt)
  • Optimized for CPU (Including Apple Silicon ARM)
  • Timestamp precision to single word

Installation

Requirements

  • make and everything else listed as required to compile whisper.cpp
  • Node.js >= 20

  1. Add the dependency to your project:
npm install @pr0gramm/fluester
  2. Download the whisper model of your choice:
npx --package @pr0gramm/fluester download-model
  3. Compile whisper.cpp, unless you want to provide your own build:
npx --package @pr0gramm/fluester compile-whisper

Usage

Important: The API only supports WAV files (just like the original whisper.cpp). Convert any other audio to WAV before passing it to the client. You can do this with ffmpeg; the following command (taken from the whisper project) produces 16 kHz, mono, 16-bit PCM output:

ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
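
If the conversion should happen from Node.js instead of a shell, the ffmpeg call above could be wrapped roughly like this. This is only a sketch: `buildFfmpegArgs` and `convertToWav` are hypothetical helpers, not part of fluester, and the code assumes an `ffmpeg` binary is available on the PATH.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: builds the same argument list as the ffmpeg command above
// (16 kHz sample rate, one channel, signed 16-bit little-endian PCM).
export function buildFfmpegArgs(input: string, output: string): string[] {
  return ["-i", input, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", output];
}

// Hypothetical helper: runs ffmpeg and resolves once the WAV file is written.
// "-y" is an addition to the command above; it overwrites an existing output
// file instead of waiting for confirmation on stdin.
export async function convertToWav(input: string, output: string): Promise<void> {
  await execFileAsync("ffmpeg", ["-y", ...buildFfmpegArgs(input, output)]);
}
```

Calling `await convertToWav("input.mp3", "output.wav")` would then yield a file that can be passed to the client. Using `execFile` with an argument array avoids shell quoting problems with unusual file names.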

Translation

import { createWhisperClient } from "@pr0gramm/fluester";

const client = createWhisperClient({
  modelName: "base",
});

const transcript = await client.translate("example/sample.wav");

console.log(transcript); // output: [ {start,end,speech} ]

Output (JSON)

[
  {
    "start": "00:00:14.310", // timestamp start
    "end": "00:00:16.480", // timestamp end
    "speech": "howdy" // transcription
  }
]
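
The timestamps are returned as strings, so computing with them requires parsing first. A minimal sketch, assuming every timestamp follows the `HH:MM:SS.mmm` shape shown above; `TranscriptLine`, `timestampToMilliseconds` and `durationMilliseconds` are hypothetical names for illustration and are not exported by fluester:

```typescript
// Shape of one transcript entry as shown in the JSON output above (assumed).
interface TranscriptLine {
  start: string;
  end: string;
  speech: string;
}

// Converts "HH:MM:SS.mmm" into whole milliseconds.
// Working in integers avoids floating-point rounding in later arithmetic.
function timestampToMilliseconds(timestamp: string): number {
  const match = /^(\d+):(\d{2}):(\d{2})\.(\d{3})$/.exec(timestamp);
  if (!match) {
    throw new Error(`Unexpected timestamp format: ${timestamp}`);
  }
  const [, hours, minutes, seconds, millis] = match;
  return (
    Number(hours) * 3_600_000 +
    Number(minutes) * 60_000 +
    Number(seconds) * 1_000 +
    Number(millis)
  );
}

// Length of a single transcript entry in milliseconds.
function durationMilliseconds(line: TranscriptLine): number {
  return timestampToMilliseconds(line.end) - timestampToMilliseconds(line.start);
}
```

For the sample entry above, `durationMilliseconds` yields 2170, i.e. "howdy" spans 2.17 seconds.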

Language Detection

import { createWhisperClient } from "@pr0gramm/fluester";

const client = createWhisperClient({
  modelName: "base",
});

const result = await client.detectLanguage("example/sample.wav");
if (result) {
  console.log(`Detected: ${result.language} with probability ${result.probability}`);
} else {
  console.log("Did not detect anything :(");
}

Tricks

This library is designed to work well in dockerized environments.

The setup steps (installing dependencies, compiling whisper.cpp, downloading a model) are independent of each other, so they can run in a separate stage of a multi-stage Docker build.

FROM node:latest AS dependencies
    WORKDIR /app
    COPY package.json package-lock.json ./
    RUN npm ci

    RUN npx --package @pr0gramm/fluester compile-whisper
    RUN npx --package @pr0gramm/fluester download-model tiny

FROM node:latest
    WORKDIR /app
    COPY --from=dependencies /app/node_modules /app/node_modules
    COPY ./ ./

This includes the model in the image. If you want to keep your image small, you can also download the model in your entrypoint using the commands above.
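
One way to defer the download to container start is a small entrypoint script that runs the `download-model` command before the application boots. The following is a sketch under assumptions: `downloadModelCommand` and `ensureModel` are hypothetical helpers, and the script does not check whether the model is already present, so the command may run again on every start.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: builds the same command used in the Dockerfile above.
export function downloadModelCommand(model: string): [string, string[]] {
  return ["npx", ["--package", "@pr0gramm/fluester", "download-model", model]];
}

// Hypothetical helper: runs the download synchronously and fails loudly,
// so the container does not start without a model.
export function ensureModel(model: string): void {
  const [command, args] = downloadModelCommand(model);
  const result = spawnSync(command, args, { stdio: "inherit" });
  if (result.error) {
    throw result.error;
  }
  if (result.status !== 0) {
    throw new Error(`download-model exited with status ${result.status}`);
  }
}
```

An entrypoint would call `ensureModel("tiny")` once and then start the actual application. The trade-off is a smaller image in exchange for a slower first start and a network dependency at runtime.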

Roadmap

  • Nothing ¯\_(ツ)_/¯
