mayeaux / generate-subtitles

Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads through yt-dlp integration

Home Page: https://freesubtitles.ai

JavaScript 63.64% CSS 0.05% Pug 36.31%

expressjs libretranslate machine-learning nodejs transcription translation whisper gpu yt-dlp

generate-subtitles's Introduction

generate-subtitles

Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations powered by LibreTranslate. Live for free public use at https://freesubtitles.ai

Installation:

Under the hood, generate-subtitles uses Whisper to create transcripts and LibreTranslate to generate translations. LibreTranslate is optional and not required to run the service.

You can find the installation instructions for Whisper here: https://github.com/openai/whisper#setup

Once Whisper is installed and working properly, you can start the web server.

Make sure you are running Node.js 14+

nvm use 14

You can install Node 14 with nvm:

# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.2/install.sh | bash

# setup nvm
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

nvm install 14
nvm use 14

The app currently uses yt-dlp as well. You can install it with:

sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp #download yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp  # Make executable

Then:

git clone https://github.com/mayeaux/generate-subtitles
cd generate-subtitles
npm install
npm start

This should start the server at localhost:3000. Open that address in a browser and you should be able to see and use the app.

Using a GPU Cloud Provider

Note: Unless you have a CUDA-capable GPU, you will likely have to transcribe on your CPU, which is significantly slower, so you may want to rent a GPU server from a cloud provider. The only GPU cloud provider I've had a good experience with is VastAI, which is what I use to run https://freesubtitles.ai. If you use this link, I should receive 2.5% of your purchase as a referral: http://vast.ai/?ref=52232

To set up the Vast server to run Whisper, you can use the following script: https://github.com/mayeaux/generate-subtitles/blob/master/docs/install.sh (Note: this script isn't perfect yet, but it has all the ingredients you need.)

While creating the Vast server, you will have to open some ports. This is the configuration I use to do that:

Hit EDIT IMAGE & CONFIG.

I select CUDA, though it's not strictly necessary.

Then hit the SELECT button (the one to the right of the CUDA description, not the one next to Cancel) and add this line to open the ports: -p 8081:8081 -p 8080:8080 -p 80:80 -p 443:443 -p 3000:3000 -p 5000:5000

Hit SELECT & SAVE, and when you create an instance it should have the proper ports open so you can access the web app. Vast uses port forwarding, so your port 3000 will be reachable through a different external port, which you should be able to find in their interface.


generate-subtitles's Issues

Use Facebook's NLLB-200 model for translation

Per a suggestion from Reddit, this seems to be a performant model that offers many more languages than LibreTranslate.

Possible script implementation: https://github.com/pluiez/NLLB-inference

Translation in https://freesubtitles.ai/

On the site https://freesubtitles.ai/ is translation only possible in those 8 languages? Is there the possibility to add Italian? Thank you

Install on Windows 10

Gotten quite close on a Windows platform, but struggling with the last bit.

Error: not found: YoutubeDL

However running "yt-dlp" purely in CMD starts YoutubeDL (so the PATH is correct)
Seems generate-sub code is struggling to find the correct paths or files?

I've dug around in the files, and attempted to add yt-dlp.exe's to several paths

Thoughts?

Please someone help me. Error: not found: YoutubeDL

Hello, I am installing yt-dlp on Windows 10.
I am trying to transcribe audio with this repo: https://github.com/mayeaux/generate-subtitles
I installed yt-dlp properly as described in the documentation; you can see the verbose output.
When I run npm start in this repo, it fails with the following error:

Error: not found: YoutubeDL. A screenshot is attached, please check: https://github.com/NabiiBux/errors/blob/main/problem.JPG

Dockerfile or docker-compose.yml - decrease starting friction

You might get more people using this if you packaged it in a nice Dockerfile or docker-compose.yml file to easily have everything installed/packaged.

Using freesubtitles.ai output shows no translation

I submitted 4 jobs to transcribe and translate videos.
The parameters were pre-filled from a previous job, and it looks as though the subtitles should be translated, but for some reason they are not.

So it's probably a visual bug: the UI shows that translation is configured without actually performing it.

It would also be great to have a function for translating subtitle files on their own, separately from transcription, if possible.
I have some subtitle files that I would love to translate using m2m100.

Download subtitles automatically?

After I upload my file, I don't want a video to be played. Nor any other options.

I just want my subtitles downloaded.

Can you make it so that just the subtitles are downloaded?

It's a waste of bandwidth to stream the video again, and for some reason the subtitles aren't easy to download after converting either.

I have to wait a long time even though the conversion is already done.

Why can't it just download automatically once the conversion is done? It would be better UX too.

Why so much friction?

Futurepedia Embed

Hey Mayeaux, I'm the founder of Futurepedia - a free directory of 300+ AI tools.
I had listed your tool on my site a few days ago and it has become really popular on it garnering >35 favourites: https://www.futurepedia.io/tool/free-subtitles-ai

I'm now releasing a feature that would allow sites to embed their Futurepedia favourite count. You can view an example over here: https://www.webcopilot.co/
If you're interested, please let me know and I can create a pull request.

What settings are you using?

If I use your service with this audio - https://www.gutenberg.org/files/20973/ogg/20973-01.ogg - it captures the entire transcription. If I use it with replicate's default settings (https://replicate.com/openai/whisper), it misses a lot, including the first few sentences. What settings are you using? I'm trying to debug this.
Thanks!

upload failed

I'm having trouble uploading audio files. The upload stops when it is almost done and says the files may be corrupt.
I only started using the website today. It worked a few times and then stopped working. The files are m4a, downloaded with yt-dl, but I even tried an mp3 music file and that didn't work either.

How to force model

How to force model?

I want to train Whisper to be better at Finnish, but then I have to force this frontend to use that model when I host it.

Use Whisper API instead of GPU or third party cloud GPU?

Hey, is it possible to add the ability to hook into the official Whisper API and process on their servers, rather than locally or on a third-party cloud GPU, similar to what you can do with ChatGPT?

Federation/sharing the load

Hey it's me again from HN! How should we get started on putting something together where I (and potentially others down the road) can provide more GPU for this?

I like docker and I like traefik. I also like Cloudflare but I'm not married to any of these things, I'd just like to help out and give my RTX 3090 more stuff to do :).

Does it support multiple files or a YouTube playlist?

Error npm start

[email protected] start /root/generate-subtitles
node app.js

maxConcurrentJobs
node env
development
FILES PASSWORD: undefined
{ nodeEnv: 'development' }
{ uploadLimitInMB: 3000 }
/root/generate-subtitles/node_modules/which/lib/index.js:111
throw getNotFoundError(cmd)
^

Error: not found: ffprobe
at getNotFoundError (/root/generate-subtitles/node_modules/which/lib/index.js:16:17)
at AsyncFunction.whichSync [as sync] (/root/generate-subtitles/node_modules/which/lib/index.js:111:9)
at Object.<anonymous> (/root/generate-subtitles/routes/transcribe.js:10:27)
at Module._compile (internal/modules/cjs/loader.js:1114:14)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1143:10)
at Module.load (internal/modules/cjs/loader.js:979:32)
at Function.Module._load (internal/modules/cjs/loader.js:819:12)
at Module.require (internal/modules/cjs/loader.js:1003:19)
at require (internal/modules/cjs/helpers.js:107:18)
at Object.<anonymous> (/root/generate-subtitles/app.js:53:20) {
code: 'ENOENT'
}
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] start: node app.js
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! /root/.npm/_logs/2023-07-22T00_22_20_603Z-debug.log
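
The trace above shows the `which` lookup for ffprobe failing while routes/transcribe.js is being loaded, which means the binary could not be found on PATH. Below is a minimal sketch of a preflight check that reports every missing tool at once, using only Node's standard library. The function names `findOnPath` and `missingBinaries` are hypothetical and are not part of this repo; the tool names are taken from the errors reported on this page.

```javascript
const fs = require('fs');
const path = require('path');

// Return the full path of `name` if it is found in one of the PATH
// directories, otherwise null. (Hypothetical helper, not from the repo.)
function findOnPath(name, env = process.env, platform = process.platform) {
  const dirs = (env.PATH || env.Path || '').split(path.delimiter).filter(Boolean);
  // On Windows, executables carry an extension listed in PATHEXT (.EXE, .CMD, ...).
  const exts = platform === 'win32'
    ? ['', ...(env.PATHEXT || '.EXE;.CMD;.BAT;.COM').split(';')]
    : [''];
  for (const dir of dirs) {
    for (const ext of exts) {
      const candidate = path.join(dir, name + ext);
      try {
        // On Windows X_OK only checks that the file exists.
        fs.accessSync(candidate, fs.constants.X_OK);
        if (fs.statSync(candidate).isFile()) return candidate;
      } catch (err) {
        // Not in this directory; keep looking.
      }
    }
  }
  return null;
}

// Return the subset of `names` that cannot be found on PATH.
function missingBinaries(names, env = process.env) {
  return names.filter((name) => findOnPath(name, env) === null);
}

// Tool names taken from the "not found" errors reported in these issues.
const missing = missingBinaries(['ffprobe', 'yt-dlp']);
if (missing.length > 0) {
  console.error(`Missing required tools on PATH: ${missing.join(', ')}`);
}
```

Running a check like this before the routes are required would turn the stack trace into a single readable line listing what still needs to be installed.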

Error npm start | SyntaxError: Unexpected token '.'

Both with main and fix-dependencies-issue branch
Ubuntu 22.04 clean Proxmox, whisper, libretranslate, rusttools installed

/root/generate-subtitles/queue/newQueue.js:34
const matchesByWebsocket = global.jobProcesses[processNumber]?.websocketNumber === websocketNumber;

SyntaxError: Unexpected token '.'
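
The `?.` on the reported line is optional chaining, which Node.js has supported since version 14, so this error suggests the app was started with an older Node (the README asks for Node.js 14+). Below is a rough sketch of a startup guard, written without modern syntax so that an old runtime can still parse it, together with an equivalent of the failing expression spelled out by hand. `nodeMajor`, `checkNodeVersion` and the function form of `matchesByWebsocket` are hypothetical names for illustration, not code from the repo.

```javascript
// Parse the major version from a string such as "v12.22.9".
function nodeMajor(versionString) {
  return parseInt(String(versionString).replace(/^v/, '').split('.')[0], 10);
}

// Hypothetical guard: returns a message when the runtime is too old, else null.
// It avoids modern syntax so that old runtimes can parse and run it.
function checkNodeVersion(versionString, minimumMajor) {
  var major = nodeMajor(versionString);
  if (major < minimumMajor) {
    return 'Node ' + minimumMajor + '+ is required, but this is ' + versionString;
  }
  return null;
}

// Equivalent of
//   jobProcesses[processNumber]?.websocketNumber === websocketNumber
// for runtimes without optional chaining.
function matchesByWebsocket(jobProcesses, processNumber, websocketNumber) {
  var job = jobProcesses[processNumber];
  var jobSocket = (job === null || job === undefined) ? undefined : job.websocketNumber;
  return jobSocket === websocketNumber;
}

var problem = checkNodeVersion(process.version, 14);
if (problem) {
  console.error(problem);
}
```

A guard like this only helps if it lives in a file that is loaded before any file using newer syntax. Checking `node --version` and switching with `nvm use 14` is the simpler fix.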

Has libreTranslate been fully implemented?

I need to process my data locally, and would like to know whether the libreTranslateWrapper is fully implemented.

I have successfully installed a LibreTranslate instance locally and tested some translations from English to other languages.

Warnings during installation. SyntaxError: Unexpected identifier after running.

When I run npm install
I get the following critical warnings.

PS H:\GitHub\generate-subtitles> npm install
npm WARN deprecated [email protected]: Deprecated, use jstransformer
npm WARN deprecated [email protected]: Please update to at least constantinople 3.1.1
npm WARN deprecated [email protected]: Jade has been renamed to pug, please install the latest version of pug instead of jade

added 59 packages, removed 19 packages, changed 31 packages, and audited 292 packages in 4s

31 packages are looking for funding
  run `npm fund` for details

13 vulnerabilities (2 low, 4 moderate, 3 high, 4 critical)

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.

Running npm audit

PS H:\GitHub\generate-subtitles> npm audit
# npm audit report

clean-css  <4.1.11
Regular Expression Denial of Service in clean-css - https://github.com/advisories/GHSA-wxhq-pm8v-cw75
fix available via `npm audit fix --force`
Will install [email protected], which is a breaking change
node_modules/clean-css
  jade  >=0.30.0
  Depends on vulnerable versions of clean-css
  Depends on vulnerable versions of constantinople
  Depends on vulnerable versions of transformers
  node_modules/jade

constantinople  <3.1.1
Severity: critical
Sandbox Bypass Leading to Arbitrary Code Execution in constantinople - https://github.com/advisories/GHSA-4vmm-mhcq-4x9j
fix available via `npm audit fix --force`
Will install [email protected], which is a breaking change
node_modules/constantinople

debug  <2.6.9
Regular Expression Denial of Service in debug - https://github.com/advisories/GHSA-gxpj-cx7g-858c
fix available via `npm audit fix --force`
Will install [email protected], which is outside the stated dependency range
node_modules/debug
  body-parser  <=1.18.1
  Depends on vulnerable versions of debug
  Depends on vulnerable versions of qs
  node_modules/body-parser
  morgan  <=1.9.0
  Depends on vulnerable versions of debug
  node_modules/morgan

express  <=4.17.2 || 5.0.0-alpha.1 - 5.0.0-alpha.7
Severity: high
qs vulnerable to Prototype Pollution - https://github.com/advisories/GHSA-hrpp-h998-j3pp
Depends on vulnerable versions of qs
Depends on vulnerable versions of send
Depends on vulnerable versions of serve-static
fix available via `npm audit fix --force`
Will install [email protected], which is outside the stated dependency range
node_modules/express

mime  <1.4.1
Severity: moderate
mime Regular Expression Denial of Service when mime lookup performed on untrusted user input - https://github.com/advisories/GHSA-wrvr-8mpx-r7pp
fix available via `npm audit fix --force`
Will install [email protected], which is outside the stated dependency range
node_modules/mime
  send  <=0.15.6
  Depends on vulnerable versions of mime
  node_modules/send
    serve-static  <=1.12.6
    Depends on vulnerable versions of send
    node_modules/serve-static


qs  <=6.2.3 || 6.5.0 - 6.5.2
Severity: high
Prototype Pollution Protection Bypass in qs - https://github.com/advisories/GHSA-gqgv-6jq5-jjj9
qs vulnerable to Prototype Pollution - https://github.com/advisories/GHSA-hrpp-h998-j3pp
qs vulnerable to Prototype Pollution - https://github.com/advisories/GHSA-hrpp-h998-j3pp
fix available via `npm audit fix --force`
Will install [email protected], which is outside the stated dependency range
node_modules/express/node_modules/qs
node_modules/qs

uglify-js  <=2.5.0
Severity: critical
Incorrect Handling of Non-Boolean Comparisons During Minification in uglify-js - https://github.com/advisories/GHSA-34r7-q49f-h37c
Regular Expression Denial of Service in uglify-js - https://github.com/advisories/GHSA-c9f4-xj24-8jqx
fix available via `npm audit fix --force`
Will install [email protected], which is a breaking change
node_modules/transformers/node_modules/uglify-js
  transformers  2.0.0 - 3.0.1
  Depends on vulnerable versions of uglify-js
  node_modules/transformers

13 vulnerabilities (2 low, 4 moderate, 3 high, 4 critical)

To address all issues (including breaking changes), run:
  npm audit fix --force

When I run npm audit fix --force a couple times, the issue goes away and it installs.
I run npm start. It starts up without errors. But when I navigate to the site, it throws the following error.

SyntaxError: Unexpected identifier
    at new Function (<anonymous>)
    at exports.compile (H:\GitHub\generate-subtitles\node_modules\jade\lib\jade.js:171:8)
    at exports.render (H:\GitHub\generate-subtitles\node_modules\jade\lib\jade.js:205:17)
    at exports.renderFile [as engine] (H:\GitHub\generate-subtitles\node_modules\jade\lib\jade.js:233:13)
    at View.render (H:\GitHub\generate-subtitles\node_modules\express\lib\view.js:135:8)
    at tryRender (H:\GitHub\generate-subtitles\node_modules\express\lib\application.js:657:10)
    at Function.render (H:\GitHub\generate-subtitles\node_modules\express\lib\application.js:609:3)
    at ServerResponse.render (H:\GitHub\generate-subtitles\node_modules\express\lib\response.js:1039:7)
    at H:\GitHub\generate-subtitles\app.js:133:7
    at Layer.handle_error (H:\GitHub\generate-subtitles\node_modules\express\lib\router\layer.js:71:5)

Disable the translations

It would be nice if we could have an option to disable (or enable) the translations.
It's a nice feature, but it takes ages to translate, much longer than the transcription itself.

Seems to be stuck at processing screen

Is there a way to track the progress and see how much time is left?

error on npm start

Hi,

I tried installing on Linux and on Windows, and both times I got the same error:

[email protected] start
node app.js

G:\PyEnvs\whisp\generate-subtitles\node_modules\which\lib\index.js:106
throw getNotFoundError(cmd)
^

Error: not found: yt-dlp
at getNotFoundError (G:\PyEnvs\whisp\generate-subtitles\node_modules\which\lib\index.js:16:17)
at AsyncFunction.whichSync [as sync] (G:\PyEnvs\whisp\generate-subtitles\node_modules\which\lib\index.js:106:9)
at Object.<anonymous> (G:\PyEnvs\whisp\generate-subtitles\download.js:9:31)
at Module._compile (node:internal/modules/cjs/loader:1159:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1213:10)
at Module.load (node:internal/modules/cjs/loader:1037:32)
at Module._load (node:internal/modules/cjs/loader:878:12)
at Module.require (node:internal/modules/cjs/loader:1061:19)
at require (node:internal/modules/cjs/helpers:103:18)
at Object.<anonymous> (G:\PyEnvs\whisp\generate-subtitles\routes\index.js:7:31) {
code: 'ENOENT'
}

I don't know my way around Node.js very well, so apologies if I've made a stupid mistake. I used the freesubtitles.ai server and I thought it was excellent!

Cheers,

Jon

How is your instance hosted?

What service are you using, and what are the specs of the server?

Any installation guide or manual for local running?

Wonderful Project. Any local installation and running guide/manual readme.md file?

How do you handle parallel connections?

I have one question: when you receive multiple requests (several users wanting to transcribe at the same time), how do you handle them? As far as I know, Whisper can only transcribe one file at a time. Is there any way to transcribe multiple files at once?
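
One common way to handle this is a job queue with a concurrency cap, so extra requests wait their turn instead of all starting at once. The startup log elsewhere on this page prints a `maxConcurrentJobs` setting and a stack trace mentions queue/newQueue.js, but the sketch below is not taken from the repo; the `JobQueue` class and its methods are hypothetical, and it only illustrates the general technique.

```javascript
// Minimal FIFO job queue: at most `maxConcurrentJobs` jobs run at once,
// and the rest wait in arrival order. (Illustrative, not the repo's code.)
class JobQueue {
  constructor(maxConcurrentJobs) {
    this.maxConcurrentJobs = maxConcurrentJobs;
    this.running = 0;
    this.waiting = [];
  }

  // `job` is a function returning a promise (for example, one transcription).
  // The returned promise settles with the job's own result or error.
  add(job) {
    return new Promise((resolve, reject) => {
      this.waiting.push({ job, resolve, reject });
      this._drain();
    });
  }

  // Start waiting jobs while there is free capacity.
  _drain() {
    while (this.running < this.maxConcurrentJobs && this.waiting.length > 0) {
      const { job, resolve, reject } = this.waiting.shift();
      this.running += 1;
      Promise.resolve()
        .then(job)
        .then(resolve, reject)
        .finally(() => {
          // Free the slot and let the next waiting job start.
          this.running -= 1;
          this._drain();
        });
    }
  }
}
```

With `new JobQueue(1)` the jobs run strictly one after another; a higher cap only makes sense if the GPU has enough memory to hold several transcriptions at once.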

Implement whisper.cpp

whisper.cpp is very performant: https://github.com/ggerganov/whisper.cpp

According to my testing, a $200 server with several TB of storage from OVH will perform a bit faster than equivalent Vast.ai servers which are somewhat unreliable and have limited storage space ($250 for 50GB storage).

Will need to refactor https://github.com/mayeaux/generate-subtitles/blob/master/transcribe/transcribing.js to wrap whisper.cpp. Will also need a setup script to download the models and run make to build the main file.

Progress reporting is implemented, frames/second is listed as a future enhancement: ggerganov/whisper.cpp#276

Timecode issue and out of sync

It seems that each subtitle is a minimum of 2 seconds long. When several subtitles that should only be 1 second long appear next to each other, everything gets out of sync. Is there something that could be done to improve the sync?
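
One possible workaround, sketched under assumptions, is to post-process the cues so that none runs past the start of the next one. The cue shape `{ start, end, text }` with times in seconds is assumed for illustration, and `trimOverlaps` and `toSrtTime` are hypothetical helpers. The issue does not say where the 2-second minimum comes from, so this treats the symptom rather than the cause.

```javascript
// Cues are assumed to look like { start, end, text } with times in seconds.
// Returns new cues, sorted by start time, in which no cue runs past the
// start of the cue that follows it.
function trimOverlaps(cues) {
  const sorted = [...cues].sort((a, b) => a.start - b.start);
  return sorted.map((cue, i) => {
    const next = sorted[i + 1];
    const end = next ? Math.min(cue.end, next.start) : cue.end;
    return { ...cue, end };
  });
}

// Format seconds as an SRT timestamp (HH:MM:SS,mmm).
function toSrtTime(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms % 1000, 3)}`;
}
```

For example, three cues that start one second apart but each last two seconds would come out with the first two trimmed to one second, while the last cue keeps its original end time.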