3Blue1Brown captions

This repository contains captions for videos on 3blue1brown. The files of the form year/video-id/language/sentence_translations.json are used to automatically create subtitles, and the goal is to use them to create automatic dubbing as well. As a proof of concept for the aim here, on this video, click the gear icon, and change the audio track to "Korean".
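As a rough illustration of the layout described above, the following sketch lists every file that matches the year/video-id/language/sentence_translations.json pattern in a local checkout. The function name `find_translation_files` and its `language` filter are hypothetical helpers introduced for this example; they are not part of the repository.

```python
from pathlib import Path


def find_translation_files(repo_root, language=None):
    """Return sorted paths matching year/video-id/language/sentence_translations.json.

    `language` is an optional directory name to filter on (an assumption of
    this sketch); when omitted, every language directory is matched.
    """
    root = Path(repo_root)
    lang_part = language if language else "*"
    return sorted(root.glob(f"*/*/{lang_part}/sentence_translations.json"))
```

Calling `find_translation_files(".", "korean")` from the repository root would return only the files under directories named `korean`, assuming that is how the language directory is spelled.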

Most of this repo was formed with a script that first used Whisper to transcribe video narration into English with punctuation and then used a translation API to translate that script sentence-by-sentence, recording the timings. Machine translation is often far from perfect, so I'll only feel comfortable with the final results here if those translations have been looked over by a native speaker.
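The sentence-by-sentence step can be pictured with a minimal sketch like the one below. It is not the script used to build this repo: the transcription is assumed to have already produced `(start, end, text)` tuples, `translate` stands in for whatever translation API is called, and the output keys `input`, `translatedText`, and `time_range` are assumed names chosen for illustration.

```python
def translate_sentences(segments, translate):
    """Translate each sentence on its own and keep its timing.

    segments:  iterable of (start_seconds, end_seconds, english_text)
    translate: any callable mapping an English string to a translated string
               (a placeholder for a real translation API)
    """
    results = []
    for start, end, text in segments:
        results.append({
            "input": text,                     # original English sentence
            "translatedText": translate(text), # machine translation, unreviewed
            "time_range": [start, end],        # timing carried over unchanged
        })
    return results
```

Keeping one record per sentence means a reviewer can correct a single translation without disturbing the timings of its neighbours.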

How to help

We're figuring out the best way to incorporate community contributions. If you'd like to help out, it might be best to hold off until a proper system is in place, and note your interest in this Google form.

The existing imperfect flow for contribution looks roughly like this:

  • Check the existing pull requests to see if anyone has already submitted edits to the translation you're interested in reviewing.
  • You can also use this Discord server for coordination and discussion.
  • Navigate to a year/video-id/language that you are interested in reviewing.
  • Click on the "sentence_translations.json" file.
    • For example, here is one for translations of the Central Limit Theorem video into Hindi.
    • If no such file is available, feel free to add an issue to this repository to request it.
  • By clicking "blame", you can see if there have been other contributors to this file, with links to the relevant commits and pull requests.
  • By clicking the pencil icon in the upper right, you can offer edits directly in the browser. Opening it in github.dev may offer a nicer experience there.
  • You can then submit those edits as a pull request.
    • It's most helpful if the title of the pull request includes the language and video name.
  • Here is my extremely imperfect means of recording which files have undergone some form of review and been merged.

Once edits to one of the sentence_translations.json files are made, subtitles can be generated automatically, and the plan is also to use this data to create automatic dubbings.
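To make the subtitle step concrete, here is a hedged sketch that turns a list of translated sentences into SubRip (SRT) text. It assumes each entry is a dict with a `time_range` of `[start, end]` in seconds and a `translatedText` string; those key names are assumptions of this example, and the repository's actual tooling may work differently.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(entries):
    """Build SRT text from entries with assumed keys time_range and translatedText."""
    blocks = []
    for index, entry in enumerate(entries, start=1):
        start, end = entry["time_range"]
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{entry['translatedText']}\n"
        )
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(blocks)
```

Loading a file with `json.load` and passing the resulting list to `to_srt` would give subtitle text that can be written to a `.srt` file, provided the entries really use the assumed keys.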

For at least the next week or two I'd recommend holding off on making any contributions. We'll take the existing contributions and use them to do some initial experiments on the text-to-voice part of the pipeline, and in all likelihood, we'll put together a better system for community contribution than raw edits to JSON files.

Contributors

3b1b, dlatikaynen, imlargo, renelle29, rajeshwar-pandey, yag000, dat-pudding, dlluatic, saurabh-git-dev, tebaioioo, realcalal, cheskoxd, marcelolynch, agustin-j, hinum, gabboronco, runningtooblivion, jsenn2, trinetra75, norude, luiz12apn, prateekbansal97, epsalon, davidbar-on, evezz, explosion-scratch, giorey01, greenst0ne, iliakonnov, jomri69
