This repository contains the code for the paper
"Topics to Avoid: Demoting Latent Confounds in Text Classification" by Sachin Kumar, Shuly Wintner, Noah Smith, and Yulia Tsvetkov.
- PyTorch 0.3.0
- Python >= 3.6
Hi,
Thanks for releasing the code.
Can you please answer a few questions?
Does this code reproduce the results from your EMNLP paper? And when do you plan to release all the preprocessing scripts, along with a README?
I tried preprocessing the Reddit L2 dataset the way you described in the paper (keeping posts longer than 50 words and balancing the dataset across all classes), but I did not get the dataset sizes you reported (260k for training, 32k for test and validation). Are these numbers for the 10 most frequent classes or for all 23 of them?
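The preprocessing described above (keep posts longer than 50 words, then balance the classes) can be sketched roughly as follows. This is only an illustration, not the authors' released script: the function name `filter_and_balance`, the `(text, label)` input format, whitespace tokenization, and balancing by downsampling to the smallest class are all assumptions.

```python
import random
from collections import defaultdict

MIN_WORDS = 50  # threshold taken from the description: posts longer than 50 words


def filter_and_balance(posts, min_words=MIN_WORDS, seed=0):
    """Filter short posts and balance classes.

    posts: iterable of (text, label) pairs.
    Hypothetical helper for illustration; it is not part of the repository.
    """
    by_label = defaultdict(list)
    for text, label in posts:
        # Whitespace tokenization is an assumption; the paper's tokenizer may differ.
        if len(text.split()) > min_words:
            by_label[label].append(text)
    if not by_label:
        return []

    # Assumed balancing strategy: downsample every class to the smallest class size.
    target = min(len(texts) for texts in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        for text in rng.sample(by_label[label], target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced
```

Different choices here (tokenizer, whether the threshold is inclusive, balancing before or after splitting, restricting to the most frequent classes) would each change the resulting dataset size, which may explain a mismatch with the reported numbers.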
In your arXiv version, the numbers for the linear classifier (LR) in Tables 1 and 4 do not match (52.5 vs. 21.1 for in-domain). I assume this is because num_classes is 10 in the former and 23 in the latter?
Can you share the exact splits (train, test, oodtest) you used in your paper?
Thanks again for releasing the code,
Ashim