
Comments (2)

mihastele commented on June 18, 2024

@KOLANICH I could also look into this. Do you have any sample videos I could try working on?

Have a great day!
All the best from Slovenia.

from ideas.

KOLANICH commented on June 18, 2024

I have no sample videos and no dataset. What you need is not a dataset of videos but a dataset of their title card frames. I have no such dataset and don't know where to get one. A good heuristic is the presence of stylized text within frames, which can probably be detected by another neural network. In any case, annotation using GPT-4 and other near-AGI models should help. If you have a video collection, it should contain quite a few videos with title cards, and so should many videos on YouTube.

I guess one can start with detecting title screens of presentations. Usually they are the first slide of a presentation, and presentations can be harvested from the internet by their filename extension. The title screens can be augmented with style-transfer neural networks to make them stylized and less text-like.
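A minimal sketch of the harvesting-by-extension step, assuming you already have a list of candidate URLs from some crawl. The extension list and the function names are illustrative assumptions, not an existing tool:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of presentation extensions; extend it to whatever formats you can render.
PRESENTATION_EXTENSIONS = {".ppt", ".pptx", ".odp", ".key"}


def is_presentation_url(url: str) -> bool:
    """Return True if the URL path ends with a known presentation extension."""
    path = urlparse(url).path  # drops the query string and fragment
    return PurePosixPath(path).suffix.lower() in PRESENTATION_EXTENSIONS


def filter_presentations(urls):
    """Keep only the URLs that look like presentation files."""
    return [u for u in urls if is_presentation_url(u)]
```

Rendering the first slide of each downloaded file into an image is a separate step and is not shown here.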

Once a model that recognizes title screens of presentations is trained, one can try using it to recognize title screens of real videos from YouTube. To get title screens you don't need whole videos: the title screens are usually within the first few minutes, and for presentations within the first few seconds. There are quite a few videos in which the title slide of a presentation is overlaid by other objects, such as a person standing in front of it or a webcam overlay. After annotation using text+picture AGI models with prompts like "does this slide look like a title", this dataset can be used to train the next-generation model.
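Since only the beginning of each video matters, frame sampling can be limited to a short window. A rough sketch that only computes the timestamps to grab; the window and step defaults are guesses, and the actual frame extraction (with whatever video tool you prefer) is left out:

```python
def sample_timestamps(duration_s, window_s=180.0, step_s=2.0):
    """Timestamps (in seconds) at which to grab frames from the start of a video.

    window_s: how far into the video to look (assumed default: first 3 minutes).
    step_s:   distance between sampled frames (assumed default: 2 seconds).
    """
    if step_s <= 0:
        raise ValueError("step_s must be positive")
    end = min(duration_s, window_s)  # never sample past the end of the video
    stamps = []
    i = 0
    while i * step_s < end:
        stamps.append(i * step_s)
        i += 1
    return stamps
```

For recorded presentations a much smaller window, a few seconds, should be enough.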

After that, the new model is applied to videos (everyone knows where to get them) containing very stylized title cards, and again the results are verified using AGI. Certain kinds of videos have title cards at exactly the same timings; this is very widespread, so it may make sense to add detection of this case.
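The fixed-timing case could be detected by counting in how many videos of a group a title card was found at roughly the same moment. A hedged sketch, assuming the detector already returns per-video timestamps; the bin size and the share threshold are arbitrary assumptions:

```python
from collections import Counter


def recurring_timings(detections, bin_s=1.0, min_share=0.8):
    """Find timings at which title cards appear in most videos of a group.

    detections: one list of detected title-card timestamps (seconds) per video.
    bin_s:      timestamps are grouped into bins of this width.
    min_share:  fraction of videos that must hit a bin for it to count.
    Returns the start times of the recurring bins, sorted.
    """
    if not detections:
        return []
    counts = Counter()
    for stamps in detections:
        # A set, so that each video votes for a bin at most once.
        counts.update({int(t // bin_s) for t in stamps})
    needed = min_share * len(detections)
    return sorted(b * bin_s for b, n in counts.items() if n >= needed)
```

Timings that straddle a bin edge get split between two bins, so a real implementation would probably need overlapping bins or clustering instead.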

I guess this is the way to get a dataset: bootstrap and improve evolutionarily, rather than trying to make the perfect model from the very first dataset obtained (that would require a dataset that is infeasible to create).
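The whole bootstrap loop could look roughly like this. Everything here is a placeholder: `train` stands for whatever training procedure is used, `verify` for the check by a text+picture model or a human, and frames are represented by plain identifiers such as file paths:

```python
from typing import Callable, Iterable, List, Tuple

Frame = str                       # placeholder: a frame identifier, e.g. a file path
Labeled = List[Tuple[Frame, bool]]
Model = Callable[[Frame], float]  # returns a title-card score in [0, 1]


def bootstrap(
    seed: Labeled,
    pools: Iterable[List[Frame]],
    train: Callable[[Labeled], Model],
    verify: Callable[[Frame], bool],
    threshold: float = 0.5,
) -> Tuple[Model, Labeled]:
    """Grow the dataset generation by generation.

    seed:  the first labeled dataset (e.g. presentation title slides).
    pools: unlabeled frame pools, ordered from easy to hard
           (e.g. recorded presentations first, stylized videos later).
    """
    labeled = list(seed)
    model = train(labeled)
    for pool in pools:
        # The current model proposes candidates from the harder pool.
        candidates = [f for f in pool if model(f) >= threshold]
        # Each candidate is verified, so false positives become negative examples.
        labeled.extend((f, verify(f)) for f in candidates)
        # The next generation is trained on the enlarged dataset.
        model = train(labeled)
    return model, labeled
```

Note that this sketch only verifies what the current model flags, so title cards it misses never enter the dataset; sampling some low-scoring frames for verification as well would help with that.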

