Comments (6)
Hi @zwsjink! For the main image-based filtering baseline in the paper, we used L/14 features, so to replicate this baseline, only the L/14 features are needed. B/32 features are used for other baselines (e.g., clip score filtering baselines). Hope that answers the question, but feel free to let me know if not!
from datacomp.
Hi @zwsjink, I agree with your conclusion! And great point about the small differences between B/32 and L/14 features for filtering and clustering. We were indeed surprised that using a stronger CLIP backbone for clustering/filtering did not lead to large gains in downstream performance.
This raises interesting questions about what makes a good dataset filtering model, which, at least from this comparison, seems slightly different from what makes a good zero-shot model.
While the difference between CLIP score filtering and image-based filtering is within a percentage point (pp), we found it interesting that “stacking” these filtering methods produces a substantial gain (approx. 3pp). There is definitely more to understand here (e.g., when one should stack filtering methods).
We experimented with IN1k and IN21k for image-based filtering just because these are common datasets and seemed like reasonable baselines. Investigating additional datasets for filtering is also an interesting direction!
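To make "stacking" concrete, here is a minimal sketch, assuming it means intersecting the keep-masks of the two filters. The function names, the cluster-matching rule for the image-based filter, and the `keep_fraction` parameter are illustrative assumptions, not the datacomp repository's API or exact procedure. Embeddings are assumed to be L2-normalized so that a dot product is a cosine similarity.

```python
import numpy as np


def clip_score_mask(scores: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Keep the top `keep_fraction` of samples by CLIP score (hypothetical helper)."""
    threshold = np.quantile(scores, 1.0 - keep_fraction)
    return scores >= threshold


def image_based_mask(
    embeddings: np.ndarray,
    centroids: np.ndarray,
    reference_embeddings: np.ndarray,
) -> np.ndarray:
    """Keep samples whose nearest centroid is also the nearest centroid of at
    least one reference image (e.g. an ImageNet embedding). This matching rule
    is an assumption made for illustration."""
    sample_cluster = np.argmax(embeddings @ centroids.T, axis=1)
    reference_cluster = np.argmax(reference_embeddings @ centroids.T, axis=1)
    return np.isin(sample_cluster, np.unique(reference_cluster))


def stacked_mask(
    scores: np.ndarray,
    embeddings: np.ndarray,
    centroids: np.ndarray,
    reference_embeddings: np.ndarray,
    keep_fraction: float,
) -> np.ndarray:
    """Stack the two filters by keeping only samples that pass both."""
    return clip_score_mask(scores, keep_fraction) & image_based_mask(
        embeddings, centroids, reference_embeddings
    )


# Toy data: two clusters, one reference image that falls in cluster 0.
toy_centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
toy_reference = np.array([[1.0, 0.0]])
toy_scores = np.array([0.1, 0.9, 0.8, 0.2])

toy_keep = stacked_mask(
    toy_scores, toy_embeddings, toy_centroids, toy_reference, keep_fraction=0.5
)
```

In the toy data only the third sample survives: it is both in the top half by score and in the cluster matched by the reference image. Swapping B/32 for L/14 would change only how `scores` and `embeddings` are computed, not the stacking logic.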
Thanks for the clarification. Yeah, I do see that in the CLIP-score-based filtering step, both L14 and B32 are supported. I suppose you guys tried image-based filtering on both L14 and B32 embeddings and found that L14 outperforms B32?
Just want to add my observation here: going through the data in Tables 21-24 of the paper, there is not much benefit in moving from B32 to L14 for CLIP score thresholding, at least at the small/medium/large scales. So that's why I'm wondering if it is worth using L14 for both CLIP score thresholding and image-based filtering. B32 is probably enough for me to reach acceptable accuracy, and it is friendlier on compute and storage.
@sagadre I also noticed that, compared with CLIP score thresholding, the image-based filter does not bring much benefit.
Is there a reason you guys used ImageNet for the relevance filter here?