Priority queue:
from video2dataset.
@rom1504
Making a summary/list of what needs to be done pre-release. I'm going over the code in execution order and noting my thoughts:
main.py
- encode_formats takes a dict as its input type. This works when calling video2dataset from Python, but I'm not sure how it behaves with fire/CLI. Any thoughts?
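I believe fire parses dict literals from the CLI on its own, but quoting differences can leave the value as a string. A defensive stdlib-only sketch (hypothetical helper, not in the repo) that accepts either form:

```python
import ast

def normalize_encode_formats(encode_formats):
    """Accept encode_formats as a dict (python API) or as the string
    form a CLI invocation might hand over, e.g. '{"video": "mp4"}'."""
    if isinstance(encode_formats, str):
        # literal_eval safely parses dict/list/str/number literals only
        encode_formats = ast.literal_eval(encode_formats)
    if not isinstance(encode_formats, dict):
        raise ValueError(f"expected a dict, got {type(encode_formats).__name__}")
    return encode_formats
```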
worker.py
- make this cleaner; the ifs are ugly and don't get the point across. The point is that the video already contains the audio, so you want something like
```python
bytes_downloaded += max(streams.get("video", 0), streams.get("audio", 0))
```
- the clipping subsampler should take encode_formats as an init param
- we can make this nicer by doing something like
```python
broadcast_subsampler = clipping_sub if "clips" in whatever else noop_sub
```
and then just calling that with streams and meta
- idea: get rid of listing out all the subsamplers; as we add more, we shouldn't have to add another if statement to the worker loop. Instead, initialize the worker with a list of non-None f"{modality}_subsamplers" attributes and just iterate over all of those. The reason the attribute would be called f"{modality}_subsamplers" is that, instead of checking whether "video" is in streams, we can iterate over all the modalities in streams and retrieve `eval(f"self.{modality}_subsamplers")`
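A minimal sketch of that modality-generic loop, assuming a hypothetical subsampler interface of `__call__(stream, meta) -> (streams_list, meta)`; `getattr` is a safer spelling of the `eval` trick:

```python
class NoOpSubsampler:
    """Placeholder subsampler: passes the stream through unchanged."""
    def __call__(self, stream, meta):
        return [stream], meta

class Worker:
    """Hold one subsampler list per modality and iterate generically,
    so adding a modality doesn't add another if statement."""
    def __init__(self, video_subsamplers=None, audio_subsamplers=None):
        self.video_subsamplers = video_subsamplers or []
        self.audio_subsamplers = audio_subsamplers or []

    def subsample(self, streams, meta):
        for modality, stream in streams.items():
            # equivalent to eval(f"self.{modality}_subsamplers"), without eval
            for subsampler in getattr(self, f"{modality}_subsamplers", []):
                [stream], meta = subsampler(stream, meta)
            streams[modality] = stream
        return streams, meta
```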
- format_type isn't an argument to the writer
data_reader.py
- rename and rethink Mp4Downloader. Maybe something like WebFileDownloader, or whatever. The process should be very similar for any .mp4, .mp3, .webm, .wav, or other extension links. Maybe we can add an is_web_link helper function that you pass the url into; if it returns true, you go to WebFileDownloader, which determines what to do with the bytes based on encode_formats, i.e. if you have audio then the bytes are audio; if you have video+audio then the url is video but you need to extract the audio, etc.
- let's commit to output dicts, kind of like with streams and encode_formats; we have this everywhere and it makes the code more readable imo
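A sketch of that hypothetical is_web_link helper (name and extension list are assumptions, not repo code):

```python
from urllib.parse import urlparse

# extensions the direct downloader would handle; extend as needed
_DIRECT_EXTENSIONS = (".mp4", ".mp3", ".webm", ".wav", ".m4a", ".ogg")

def is_web_file_link(url):
    """Return True if url points directly at a media file we can fetch,
    as opposed to e.g. a YouTube page that needs yt-dlp."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.path.lower().endswith(_DIRECT_EXTENSIONS)
```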
- sample_rate shouldn't be a key in encode_formats; it should be a parameter to main that gets used in some audio subsampler or whatever
- video2audio is kind of already doing fusing without an explicit audio subsampler. I'll move this into an audio subsampler and we can come back to it when we do fusing properly (although this isn't great fusing, tbh). Also add a test for it in test_subsamplers.py; that should help with the FFmpeg errors we're not catching in data_reader.py's video2audio
- video2audio also needs better param names
- remove this, and rename things to "audio_file", "video_file". Let's keep naming conventions involving modalities consistent to stay open to tricks like `eval(f"{modality}_something")` and such. We can also reduce some lines in the VideoDataReader class with those tricks
- the audio tempfile isn't deleted
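For the leaked tempfile, one pattern that guarantees cleanup even when processing raises (a sketch; the processing step is a placeholder):

```python
import os
import tempfile

def with_audio_tempfile(audio_bytes, suffix=".m4a"):
    """Write audio bytes to a named tempfile, hand the path to a
    processing step, and always delete the file afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        # ... hand `path` to ffmpeg / the reader here ...
        return os.path.getsize(path)
    finally:
        # runs on success and on exception, so the tempfile never leaks
        os.remove(path)
```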
- even though for now it is yt_meta_dict, let's just call it meta_dict
- YtDlpDownloader needs editing: better format string — why m4a? why not the extension passed in via encode_formats? It also does subsampling in the downloader, which we don't want yet. You can probably do something similar to what we do for resolution, i.e. get the closest possible higher sample rate by passing the correct format string. I'm also not sure we can just replace like this. Finally, I need to make sure video is downloaded separately from audio and audio separately from video; then we can download them separately and edit that max() line above to just sum the bytes, which will be exactly right. Some useful links for me later: https://write.corbpie.com/downloading-youtube-videos-as-audio-with-yt-dlp/, https://superuser.com/questions/1266162/youtube-dl-set-sample-rate-on-mp3
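If I remember yt-dlp's format selection correctly, `asr` is the audio-sampling-rate filter field, so the resolution trick would translate to something like this (hypothetical helper; the exact selector syntax should be double-checked against the yt-dlp docs):

```python
def build_audio_format_string(sample_rate=None, ext=None):
    """Build a yt-dlp format selector that prefers an audio stream at or
    above the requested sample rate, falling back to plain bestaudio."""
    selector = "ba"  # "ba" is yt-dlp shorthand for bestaudio
    if ext:
        selector += f"[ext={ext}]"
    if sample_rate:
        # try the filtered selector first, then fall back unfiltered
        selector = f"{selector}[asr>={sample_rate}]/{selector}"
    return selector
```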
- in general, the data reader needs to generalize to video formats other than mp4 (right now it's overfit to that)
data_writer.py
- think about this: do we need to be iterating over encode_formats? Maybe it's enough to just iterate over streams and write nothing when a modality isn't present in streams. When does that happen? Do we want to write an empty meta then?
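A sketch of the streams-first direction (hypothetical helper; returns filename→bytes rather than writing, to keep the idea separable from the tar writer):

```python
def collect_writes(streams, meta, encode_formats):
    """Build the shard entries by iterating over streams, not
    encode_formats, so an absent modality is simply skipped."""
    out = {}
    for modality, data in streams.items():
        ext = encode_formats.get(modality)
        if ext is not None:
            out[f"{meta['key']}.{ext}"] = data
    return out
```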
- writers need testing
subsamplers/clipping_subsampler.py
- same as with noop_subsampler: encode_formats should be an init param, not a call param
- need to check that audio clipping works as intended, i.e. that it lines up with the video and the correct number of clips are correctly ID'd, etc.
tests/test_downloaders.py
tests/test_audio.py
- should be renamed to test_reader.py and we should actually test the reader
- actually pretty good, just needs to be adapted a bit; add more parameters to test whether the video is read properly, etc. We should also test that the correct error_messages are being returned (or exceptions thrown, if we decide to merge that one PR)
README.md
- the examples are getting too long and unintuitive; we should just add more things to the examples directory
- specifically, let's show how to use encode_formats and other params like that
- also let's add a tutorial on how to run this with distributed=spark; I think that's not obvious but very useful
- maybe add a citation?
besides the above cleanup, there are a few more PRs to handle:
- #91 - improves subtitle support and fixes a few things, 100% needs to get merged
- #92 - I think this is worth trying and considering
- #80 - check if it's done
v1 ideas
While going through the code I had some ideas for v1:
- if encode_formats has both video and audio, perhaps we should do most of the pipeline with video and audio in one mp4 byte stream instead of separate video and audio streams, so subsampling such as clipping can be done on both together; we can then split it up at the end instead of the other way around
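The final split step could be plain ffmpeg stream copies (`-an` drops audio, `-vn` drops video, `-c copy` avoids re-encoding). A sketch that only builds the commands (a hypothetical helper; output extensions depend on the source codecs):

```python
def split_commands(muxed_path, video_out, audio_out):
    """After clipping the muxed mp4, split it into a video-only and an
    audio-only file without re-encoding either stream."""
    return [
        ["ffmpeg", "-i", muxed_path, "-an", "-c:v", "copy", video_out],
        ["ffmpeg", "-i", muxed_path, "-vn", "-c:a", "copy", audio_out],
    ]
```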
Delaying, we need to get some successful use case of data from this repo. Either SVD or VideoCLIP or whatever.
I would call this done; codebases are using this
yeah sure, maybe it would be good to update pip package if this is the case