Live Captions

Live Captions is an application that provides live captioning for the Linux desktop.

Download on Flathub

Join the Discord Chat if you're interested in keeping in touch.

Currently, only English is supported. Other languages may produce gibberish or a poor phonetic transcription.

Features:

  • Simple interface
  • Caption desktop/mic audio locally; audio is never sent anywhere
  • Does not rely on any proprietary services/libraries
  • Adjust font, font size, and text casing
  • Optional token-level confidence text fading

Live Captions needs a CPU fast enough to caption in real time, especially if you want to run other tasks (such as video decoding) alongside it. It has been tested and works on:

  • Intel i7-2670QM (2011)
  • Intel i5-8265U (2018)
  • AMD Ryzen 5 1600 (2017)
  • Steam Deck

A GPU is neither required nor used.

Accuracy

The live captions may not be accurate. They may contain mistakes, including with numbers. Please do not rely on the results for anything critical or important.

More models may be trained and released in the future with better and more robust accuracy.

Library

This application is built using aprilasr, a new library for realtime speech recognition.

Model credit

Thanks to Fangjun Kuang for the pretrained model, and thanks to the icefall contributors for creating the model recipes.

Building

Make sure to do a recursive clone so that the dependencies are fetched:

$ git clone --recursive https://github.com/abb128/LiveCaptions.git

If you forgot, you can initialize submodules like so:

$ git submodule update --init --recursive
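
To confirm that no submodule was left uninitialized, one option is to count the lines of `git submodule status` that start with `-`, which is how Git marks an uninitialized submodule. The helper below is only a sketch; the name `count_uninitialized` is made up for illustration and is not part of this project.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints how many submodules are still uninitialized (lines starting with "-").
count_uninitialized() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c '^-' || true
}
```

Run from the repository root, `git submodule status --recursive | count_uninitialized` should print 0 once everything has been fetched.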

Option 1: Building with GNOME Builder (easy)

You can build this easily with GNOME Builder. After cloning, open the project directory in GNOME Builder, download the SDK if it asks you, and click the play button to build and run.

If you are using Flatpak GNOME Builder and experience issues running this (for example, some cryptic X Window System error), please try using your distro's native packaged version of GNOME Builder instead of Flatpak (e.g. sudo apt install gnome-builder).

Option 2: Building from the terminal (not as easy)

First you must download ONNXRuntime v1.13.1, extract it somewhere, and set the environment variables to point to it:

$ export ONNX_ROOT=/path/to/onnxruntime-linux-x64-1.13.1/
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/onnxruntime-linux-x64-1.13.1/lib
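
A wrong path here tends to show up only later as a confusing build or link error, so it can help to check the directory first. The sketch below assumes the extracted release archive contains an `include/` directory and `lib/libonnxruntime.so`; the function name `check_onnx_root` is hypothetical.

```shell
# Hypothetical helper: prints "ok" if the given directory looks like an
# extracted ONNXRuntime release, otherwise prints "missing" and returns 1.
# Assumption: the release ships include/ and lib/libonnxruntime.so.
check_onnx_root() {
    root="${1%/}"   # drop a trailing slash, if any
    if [ -d "$root/include" ] && [ -e "$root/lib/libonnxruntime.so" ]; then
        echo "ok"
    else
        echo "missing"
        return 1
    fi
}
```

For example, `check_onnx_root "$ONNX_ROOT"` can be run right after the exports above.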

Alternatively, you should be able to build and install ONNXRuntime locally, in which case the step above shouldn't be necessary.

To set up a build, run these commands:

$ meson setup builddir
$ meson devenv -C builddir

Now you can build the application by running ninja.

Before being able to run the app, you must also download the model and export APRIL_MODEL_PATH to where the model is. For example:

$ wget https://april.sapples.net/aprilv0_en-us.april
$ export APRIL_MODEL_PATH="$(pwd)/aprilv0_en-us.april"

You should now be able to run the app with src/livecaptions.
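
If the app fails to start, a quick thing to rule out is a missing or mistyped model path. The following is a minimal sketch of such a check; `model_status` is a hypothetical name, and the check only tests that the file exists, not that it is a valid model.

```shell
# Hypothetical helper: prints "ready" if APRIL_MODEL_PATH is set and points
# at an existing file, otherwise prints "missing" and returns 1.
model_status() {
    if [ -n "${APRIL_MODEL_PATH:-}" ] && [ -f "$APRIL_MODEL_PATH" ]; then
        echo "ready"
    else
        echo "missing"
        return 1
    fi
}
```

It can be chained with the launch command, for example `model_status && src/livecaptions`.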
