Comments (6)
Hi @namespace-Pt,
Sorry for the late answer!
I am not sure I completely got your point. For the FLOPS estimation, we rely on the derivation from "Minimizing FLOPs to Learn Efficient Sparse Representations". The activation probabilities are estimated directly from the lengths of the posting lists (for both documents and queries; for the latter, we simply "index" them).
Let us know if you need more details,
Thibault
from splade.
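For readers following along, the estimation described above can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the function names and the toy numbers are hypothetical, and it assumes the FLOPS figure is the expected number of term-wise multiplications per query-document pair, i.e. the sum over vocabulary terms of p_j(query) * p_j(document), where each p_j is a posting-list length divided by the number of indexed items.

```python
import numpy as np


def activation_probs(posting_lengths, n_items):
    """p_j = fraction of indexed items in which term j has a non-zero weight."""
    return np.asarray(posting_lengths, dtype=np.float64) / n_items


def expected_flops(doc_posting_lengths, n_docs, query_posting_lengths, n_queries):
    """Estimate sum_j p_j(query) * p_j(document) from posting-list lengths."""
    p_d = activation_probs(doc_posting_lengths, n_docs)
    p_q = activation_probs(query_posting_lengths, n_queries)
    return float(np.sum(p_q * p_d))


# Toy vocabulary of 4 terms (hypothetical numbers).
doc_lengths = [50, 10, 0, 40]    # posting-list lengths over 100 documents
query_lengths = [2, 0, 1, 4]     # posting-list lengths over 10 "indexed" queries
flops = expected_flops(doc_lengths, 100, query_lengths, 10)
print(flops)
```

A term only contributes when it is active on both sides, which is why the queries need to be "indexed" as well.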
Thank you @thibault-formal! I'll read the paper.
BTW, do you have a checkpoint of SPLADEv2-distill? I tried to reproduce your result but failed. I found that with distillation we do not need a ground-truth passage for each query, so did you use the queries that are absent from qrels.train.tsv to train the student model?
I would appreciate it if you could also provide the SPLADEv2-max (MRR@10=0.34) checkpoint. I want to see how the model learns to distribute the tokens.
Hi @namespace-Pt,
- We have some models available here: https://europe.naverlabs.com/research/machine-learning-and-optimization/splade-models/. New models will come soon. SPLADEv2-max is not there yet, but we can add it, even though we have better models than this one trained without distillation.
- For your original question on the FLOPS, I understood that you were asking whether we should measure FLOPS(Q*D) versus FLOPS(Q) + FLOPS(D). We have tried that and it did not change the results much. Sometimes, it is better to adjust the query and document representations differently.
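As an illustration of the last point, here is a rough sketch of a FLOPS-style regularizer applied separately to query and document representations, each with its own weight. It is a sketch under assumptions and not the repository's implementation: it assumes the regularizer has the form sum_j (mean_i |w_ij|)^2 over a batch, and the function names, `lambda_q`, `lambda_d`, and the toy values are hypothetical.

```python
import numpy as np


def flops_reg(weights):
    """FLOPS-style penalty for a (batch, vocab) matrix: sum_j (mean_i |w_ij|)^2."""
    w = np.abs(np.asarray(weights, dtype=np.float64))
    return float(np.sum(np.mean(w, axis=0) ** 2))


def separate_regularization(q_weights, d_weights, lambda_q, lambda_d):
    """Penalize queries and documents independently, each with its own strength."""
    return lambda_q * flops_reg(q_weights) + lambda_d * flops_reg(d_weights)


# Toy batch: 2 queries and 2 documents over a 3-term vocabulary.
q = [[1.0, 0.0, 0.0], [1.0, 0.0, 2.0]]
d = [[0.0, 2.0, 2.0], [2.0, 2.0, 0.0]]

# Arbitrary weights, chosen only to show that the two sides can be tuned apart.
reg = separate_regularization(q, d, lambda_q=0.1, lambda_d=0.5)
print(reg)
```

Keeping two weights makes it possible to push queries toward sparsity more (or less) aggressively than documents.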
@sclincha @namespace-Pt the weights for SPLADEv2-max can actually be found in the weights folder in this repo.
@sclincha @thibault-formal Thank you! I'll check it out.