Comments (8)
Code to load the XING dataset has been uploaded to master; see https://github.com/fair-search/fairsearch-elasticsearch-plugin/blob/master/demo/load-xing-dataset.rb for details.
from fairsearch-fair-for-elasticsearch.
A dummy query for manually checking operations against this dataset has also been pushed to master; see https://github.com/fair-search/fairsearch-elasticsearch-plugin/blob/master/demo/xing.query
@chatox @tsuehr it would be nice to have a list of queries (term, precision, significance and k) with the number of results the algorithm is expected to return. I am not sure I can find this kind of information in the paper.
This needs to be constructed synthetically. Example:
query = hello
doc1 = hello hello hello hello
doc2 = hello hello hello bye
doc3 = hello hello bye bye
doc4 = hello bye bye bye
doc5 = bye bye bye bye
Now, by assigning different genres to genre1 ... genre5, one can generate expected result lists in different orderings. This depends on table p.
I suggest not tying this to the German credit score dataset, but instead doing it generically with synthetic examples such as the one I've shown.
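The synthetic example above can be sketched as a small script that produces expected orderings for a test. This is only an illustration: the term-frequency score, the `fair_rerank` helper, the choice of protected documents, and the mtable values are all assumptions made up here (the mtable is not computed from any real p or alpha), and the plugin's actual scoring and re-ranking may differ.

```python
from collections import deque

# Synthetic corpus from the example above.
DOCS = {
    "doc1": "hello hello hello hello",
    "doc2": "hello hello hello bye",
    "doc3": "hello hello bye bye",
    "doc4": "hello bye bye bye",
    "doc5": "bye bye bye bye",
}


def term_frequency_ranking(query, docs):
    """Rank doc ids by how often the query term occurs (toy relevance score)."""
    scores = {doc_id: text.split().count(query) for doc_id, text in docs.items()}
    # Highest count first; ties broken by doc id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def fair_rerank(ranking, protected, mtable):
    """Hypothetical greedy re-ranking.

    mtable[i] is the assumed minimum number of protected documents
    required among the first i + 1 results.
    """
    position = {doc_id: i for i, doc_id in enumerate(ranking)}
    prot = deque(d for d in ranking if d in protected)
    nonprot = deque(d for d in ranking if d not in protected)
    result, seen_protected = [], 0
    for required in mtable:
        if not prot and not nonprot:
            break
        take_protected = bool(prot) and (
            seen_protected < required          # constraint forces a protected doc
            or not nonprot                     # nothing else left
            or position[prot[0]] < position[nonprot[0]]  # it ranks higher anyway
        )
        if take_protected:
            result.append(prot.popleft())
            seen_protected += 1
        else:
            result.append(nonprot.popleft())
    return result


ranking = term_frequency_ranking("hello", DOCS)
# Made-up mtable for k = 5: at least 1 protected doc in the top 2, 2 in the top 4.
mtable = [0, 1, 1, 2, 2]
print(fair_rerank(ranking, {"doc4", "doc5"}, mtable))
print(fair_rerank(ranking, {"doc1", "doc2"}, mtable))
```

Changing which documents are marked as protected changes the expected list, which is what an integration test could assert on: with the low-scoring documents protected, they are pulled up to satisfy the table; with the high-scoring ones protected, the original order is kept.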
That works for me; we can also do that. Would you be so kind as to prepare a dummy test set, including the expected number of answers (per protected category), that we can translate into an integration test in the plugin? Just to make sure we do the right verifications.
Sure, that would be based on some mtable prepared by @tsuehr
@chatox a test based on what we discussed and what you explained in this issue has been created and pushed at
Over the next few days I will also add more edge cases, with few protected elements vs. lots of protected ones, etc.
More tests have been added: 806b804