
HCL's Introduction

Contrastive Learning with Hard Negative Samples

We consider the question: how can you sample good negative examples for contrastive learning? We argue that, as with metric learning, learning contrastive representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor point). The key challenge in using hard negatives is that contrastive methods must remain unsupervised, making it infeasible to adopt existing negative sampling strategies that use label information. In response, we develop a new class of unsupervised methods for selecting hard negative samples in which the user can control the amount of hardness. A limiting case of this sampling results in a representation that tightly clusters each class and pushes different classes as far apart as possible. The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.

Contrastive Learning with Hard Negative Samples [paper]
Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, and Stefanie Jegelka
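
For readers skimming the issues below, here is a minimal sketch of the hard-negative objective as it appears in image/main.py. The reweighting and debiasing lines match the snippets quoted in the issues; the batching, masking, and default hyperparameters are my assumptions about a typical SimCLR-style setup, not necessarily the repo's exact code.

    import math
    import torch

    def hard_negative_loss(out_1, out_2, temperature=0.5, beta=1.0, tau_plus=0.1):
        # out_1, out_2: L2-normalized embeddings of two augmented views, shape (B, D)
        batch_size = out_1.size(0)
        out = torch.cat([out_1, out_2], dim=0)                        # (2B, D)

        # pairwise exponentiated similarities, with self and positive pairs masked out
        sim = torch.exp(torch.mm(out, out.t()) / temperature)         # (2B, 2B)
        mask = torch.ones(2 * batch_size, 2 * batch_size, dtype=torch.bool, device=out.device)
        mask.fill_diagonal_(False)
        idx = torch.arange(batch_size, device=out.device)
        mask[idx, idx + batch_size] = False
        mask[idx + batch_size, idx] = False
        neg = sim.masked_select(mask).view(2 * batch_size, -1)        # (2B, 2B - 2)

        # positive pair score, duplicated so each view is paired with its counterpart
        pos = torch.exp(torch.sum(out_1 * out_2, dim=-1) / temperature)
        pos = torch.cat([pos, pos], dim=0)                            # (2B,)

        # hard-negative reweighting and debiasing (the lines quoted in the issues below)
        N = 2 * batch_size - 2
        imp = (beta * neg.log()).exp()                                # equals neg ** beta
        reweight_neg = (imp * neg).sum(dim=-1) / imp.mean(dim=-1)
        Ng = (-tau_plus * N * pos + reweight_neg) / (1 - tau_plus)
        Ng = torch.clamp(Ng, min=N * math.exp(-1.0 / temperature))    # numerical floor

        return (-torch.log(pos / (pos + Ng))).mean()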

Citation

If you find this repo useful for your research, please consider citing the paper

@article{robinson2020hard,
  title={Contrastive Learning with Hard Negative Samples},
  author={Robinson, Joshua and Chuang, Ching-Yao and Sra, Suvrit and Jegelka, Stefanie},
  journal={International Conference on Learning Representations},
  year={2021}
}

For any questions, please contact Josh Robinson ([email protected]).

Acknowledgements

Part of this code is inspired by leftthomas/SimCLR, by chingyaoc/DCL, and by fanyun-sun/InfoGraph.

HCL's People

Contributors

joshr17


HCL's Issues

A question about STL10 train_transform

Hi,
If I'm not mistaken, it seems that the code applies the CIFAR10-style transforms to STL10 during training. I am wondering why it is done this way. Is this a mistake, or is it intentional?

Many thanks
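
(For reference, the CIFAR10-style pipeline used in SimCLR-derived repos such as leftthomas/SimCLR, which this repo acknowledges, typically looks roughly like the sketch below. It illustrates the kind of transform being asked about and is not necessarily this repo's exact code.)

    from torchvision import transforms

    # Typical CIFAR10-style SimCLR augmentation; note the 32x32 crop size,
    # which is the point of the question when it is applied to 96x96 STL10 images.
    train_transform = transforms.Compose([
        transforms.RandomResizedCrop(32),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
        transforms.RandomGrayscale(p=0.2),
        transforms.ToTensor(),
        transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
    ])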

A question about code

Hi,
In graph/cortex_DIM/functions/gan_loss.py, line 89 (Eq -= log_) raises a NameError: name 'log_' is not defined.

Question on negatives reweight implementation

Hi,

I am having some difficulty understanding the hard sampling objective implemented for the image tasks.
In your code, the reweighting is normalized by the mean of imp rather than by the mean of the raw neg inner products, which seems to differ from the pseudo-code on line 16 of Figure 7 in your paper.

reweight_neg = (imp*neg).sum(dim = -1) / imp.mean(dim = -1)

I wonder which one is correct: should the denominator be imp.mean(dim=-1) or neg.mean(dim=-1)?

Many thanks,
Qi Yan
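
(A small self-contained check of the two candidate denominators, purely illustrative; here neg stands for the exponentiated similarities exp(sim / t), as in image/main.py.)

    import torch

    torch.manual_seed(0)
    temperature = 0.5
    sim = torch.rand(4, 6) * 2 - 1                        # fake cosine similarities in [-1, 1]
    neg = torch.exp(sim / temperature)                    # exponentiated negative scores

    for beta in (1.0, 0.5):
        imp = (beta * neg.log()).exp()                    # equals neg ** beta
        v1 = (imp * neg).sum(dim=-1) / imp.mean(dim=-1)   # denominator used in image/main.py
        v2 = (imp * neg).sum(dim=-1) / neg.mean(dim=-1)   # alternative raised in this issue
        print(beta, torch.allclose(v1, v2))               # the two agree only when beta == 1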

Beta in reweight

Thank you for the awesome repository!

Here the beta concentration parameter appears in the exponent:

HCL/image/main.py

Lines 43 to 44 in 36e3934

imp = (beta* neg.log()).exp()
reweight_neg = (imp*neg).sum(dim = -1) / imp.mean(dim = -1)

Isn't that different from what is described in the paper, where the pseudocode in Figure 13 does not have beta in the exponent:

reweight = (beta*neg) / neg.mean()

What am I missing?
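
(One way to see what the quoted line computes, offered as a small check rather than an official answer: the log/exp pair simply raises neg to the power beta, which is the same as putting beta on the similarity inside the exponent.)

    import torch

    torch.manual_seed(0)
    beta, t = 0.5, 0.5
    sim = torch.rand(8) * 2 - 1                       # fake cosine similarities
    neg = torch.exp(sim / t)                          # exponentiated scores, as in main.py

    imp = (beta * neg.log()).exp()                    # the line quoted above
    print(torch.allclose(imp, neg ** beta))                   # True
    print(torch.allclose(imp, torch.exp(beta * sim / t)))     # True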

[documentation] Pair descriptions

When processing images using the utils script, could you clarify the definitions of the inputs?

I'm assuming:

  • self.targets: a list of reference images
  • self.data: a list of images of an alternate class

In this way, I would expect that __getitem__() would return either:

  1. two random transformations of images from a different class, plus the original target, or
  2. if using the positive pair strategy, one augmented non-class image, one augmented image of the same class, and the target.

Could you confirm if I have this correct?
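
(For context, the *Pair dataset wrappers in SimCLR-derived repos, e.g. leftthomas/SimCLR which this repo credits, usually look roughly like the sketch below. There, self.data and self.targets are simply the raw images and labels inherited from the torchvision dataset; the class name here is only illustrative.)

    from PIL import Image
    from torchvision.datasets import CIFAR10

    class CIFAR10Pair(CIFAR10):
        """Return two independently augmented views of the same image, plus its label."""

        def __getitem__(self, index):
            img, target = self.data[index], self.targets[index]   # raw image array and label
            img = Image.fromarray(img)

            pos_1 = self.transform(img)   # first random augmentation of the image
            pos_2 = self.transform(img)   # second random augmentation of the same image

            return pos_1, pos_2, target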

Question about reweight in Graph embedding

Hi Josh, thanks for your great repository!
I have a question about the reweight in InfoGraph.

  • In HCL/graph/cortex_DIM/functions/gan_losses.py we have:

    reweight= -2*q_samples / max( q_samples.max(), q_samples.min().abs())
    reweight=(beta * reweight).exp()
    reweight = reweight / reweight.mean(dim=1).view(-1,1)
  • while in Figure 14 of the paper, we have the following pseudocode:

    reweight = 2 * neg / max( neg.max().abs(), neg.min().abs() )
    reweight = (beta * reweight) / reweight.mean()

The paper uses 2 * neg while the code uses -2 * q_samples, so negative samples closer to the anchor get a smaller reweight. Why are they different?
What am I missing?

Release s2v-model.py for sentence embedding

Hello,

I really enjoyed reading your work! The math is beautiful. I was wondering whether you could release s2v-model.py for your sentence embedding experiments.

Thanks in advance!

Question in code

Thank you for your great work!
I have a doubt about the code: why is there a factor of N in Ng = (-tau_plus * N * pos + reweight_neg) / (1 - tau_plus)?
Looking forward to your response. Thank you.
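
(A possible reading, based on the debiased objective of Chuang et al.'s DCL, which this loss builds on; not an official answer. Ng is meant to estimate the total contribution of the N = 2 * batch_size - 2 true negatives, so the false-negative correction is applied at the scale of a sum over N terms:

    Ng = (reweight_neg - tau_plus * N * pos) / (1 - tau_plus)  ≈  N * E[exp(sim / t) over true negatives]

In the unweighted case, i.e. beta = 0, reweight_neg reduces to neg.sum(dim=-1), a sum over the N observed negatives, so pos is multiplied by N to put the correction term on the same scale as that sum.)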

What is the relationship between beta and temperature?

This paper claims that beta can tune the level of hardness:

The level of “hardness” in our method can be smoothly adjusted allowing the user to select the hardness that best trades-off between an improved learning signal from hard negatives, and the harm due to the correction of false negatives being only approximate.

In the paper "Understanding the Behaviour of Contrastive Loss", the authors claim that:

We will show that the contrastive loss is a hardness-aware loss function, and the temperature τ controls the strength of penalties on hard negative samples

So my question is: are these two hyperparameters equivalent?
