MixCSE_AAAI2022

A PyTorch implementation for our paper "Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives".

You can download the paper from here.

Abstract

Unsupervised sentence representation learning is a fundamental problem in natural language processing. Recently, contrastive learning has achieved great success on this task. Existing contrastive learning based models usually apply random sampling to select negative examples for training. Previous work in computer vision has shown that hard negative examples help contrastive learning achieve faster convergence and better optimization for representation learning. However, the importance of hard negatives in contrastive learning for sentence representation is yet to be explored. In this study, we prove that hard negatives are essential for maintaining strong gradient signals in the training process, while randomly sampled negative examples are ineffective for sentence representation. Accordingly, we present a contrastive model, MixCSE, that extends the current state-of-the-art SimCSE by continually constructing hard negatives via mixing both positive and negative features. The superior performance of the proposed approach is demonstrated via empirical studies on Semantic Textual Similarity datasets and Transfer task datasets.
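
To make the idea of "mixing both positive and negative features" concrete, here is a minimal pure-Python sketch. It is not taken from this repository: it assumes the mixed negative is a convex combination of a positive and a negative feature followed by L2 normalization, and the names mix_negative and lam, as well as the toy vectors, are hypothetical. A real training loop would operate on batched tensors instead of lists.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (assumes the vector is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def mix_negative(positive, negative, lam=0.2):
    """Assumed mixing rule: lam * positive + (1 - lam) * negative, then normalize.

    `lam` is a hypothetical mixing weight in [0, 1]; larger values pull the
    mixed feature closer to the positive, which makes it a harder negative.
    """
    mixed = [lam * p + (1.0 - lam) * n for p, n in zip(positive, negative)]
    return l2_normalize(mixed)


# Toy 2-d features, chosen only for illustration.
anchor = [1.0, 0.0]
positive = [0.8, 0.6]
negative = [0.0, 1.0]

mixed = mix_negative(positive, negative, lam=0.5)
sim_random = cosine(anchor, negative)
sim_mixed = cosine(anchor, mixed)
print(sim_random, sim_mixed)
```

In this toy case the mixed feature is more similar to the anchor than the randomly sampled negative, which is what makes it a "hard" negative under the stated assumption.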

Requirements

  • Python = 3.7
  • torch = 1.11.0
  • numpy = 1.17.2
  • transformers = 4.19.2

Train

bash run_unsup_example.sh

Evaluate

python evaluation.py \
    --model_name_or_path trained_model \
    --pooler cls \
    --task_set sts \
    --mode test

Citation

If this work is helpful, please cite as:

@article{zhang2022unsupervised,
  title={Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives},
  author={Zhang, Yanzhao and Zhang, Richong and Mensah, Samuel and Liu, Xudong and Mao, Yongyi},
  year={2022}
}

License

MIT

mixcse_aaai2022's Issues

Reproducibility

Hello, thanks for releasing the code for MixCSE.
While running the code, I ran into some issues.
I got it to run by changing a few lines, but I am not sure the changes are correct.
Could you confirm whether these changes are fine?

  1. data/sts-dev.tsv
    I could not find a file named sts-dev.tsv in this repository, so when running train.py I changed the code to use SimCSE/SentEval/data/downstream/STS/STSBenchmark/sts-dev.csv, which can be downloaded from the SimCSE GitHub repository (https://github.com/princeton-nlp/SimCSE).
    I also changed the default value of the eval_path argument in evaluation.py.
    Is this OK? If not, could you upload the sts-dev.tsv file?

  2. data_files
    I get an error when loading the dataset.
    So I replaced the code that loads data_files with the lines from the SimCSE GitHub repository, as below.
    Is this OK? Is there a special reason to use "datasets_cache/text.py"?

%
data_files = {}
if data_args.train_file is not None:
    data_files["train"] = data_args.train_file
extension = data_args.train_file.split(".")[-1]
if extension == "txt":
    extension = "text"
if extension == "csv":
    datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/", delimiter="\t" if "tsv" in data_args.train_file else ",")
else:
    datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/")
%
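
The extension handling in the snippet above can be isolated into a small helper, which makes it easier to check without touching the training script. This is only a sketch: resolve_loader_args is a hypothetical name, the file paths are placeholders, and mapping a bare .tsv extension to the csv builder with a tab delimiter is an added assumption that the quoted snippet does not make.

```python
def resolve_loader_args(train_file):
    """Return (builder_name, extra_kwargs) to pass to datasets.load_dataset.

    Hypothetical helper mirroring the extension logic quoted above.
    """
    extension = train_file.split(".")[-1].lower()
    if extension == "txt":
        # Plain-text files, one sentence per line, use the "text" builder.
        return "text", {}
    if extension == "tsv":
        # Assumption: read .tsv files with the csv builder and a tab delimiter.
        return "csv", {"delimiter": "\t"}
    if extension == "csv":
        # Same rule as the quoted snippet: tab if "tsv" appears in the path.
        delimiter = "\t" if "tsv" in train_file else ","
        return "csv", {"delimiter": delimiter}
    return extension, {}


# Placeholder paths, for illustration only.
print(resolve_loader_args("data/train.txt"))
print(resolve_loader_args("data/train.csv"))
print(resolve_loader_args("data/train.tsv"))

# Possible use inside the training script (not executed here):
# name, kwargs = resolve_loader_args(data_args.train_file)
# datasets = load_dataset(name, data_files={"train": data_args.train_file},
#                         cache_dir="./data/", **kwargs)
```

Keeping the decision in one function means a wrong builder name or delimiter shows up as a simple return value rather than as a dataset loading error.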

With these changes, I obtained a test STS Avg. of 76.23 (seed=42).
I think different versions of libraries such as transformers and PyTorch could be the reason for any difference from your results.
I would appreciate it a lot if you could let me know what performance you got when running run_unsup_example.sh (seed=42), for comparison.
Thanks again for your work.
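
Regarding item 1, the substitution only works if the evaluation code reads the file the same way. As a hedged sketch, the following assumes the STSBenchmark sts-dev.csv layout used by SentEval: tab-separated fields with the gold score in column 4 and the two sentences in columns 5 and 6 (0-indexed). The function name parse_sts_lines and the sample row are made up for illustration.

```python
def parse_sts_lines(lines):
    """Parse STS-B style rows into (sentence1, sentence2, score) tuples.

    Assumes tab-separated rows: genre, file, year, id, score, sent1, sent2.
    Rows with fewer than seven fields are skipped.
    """
    pairs = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue
        pairs.append((fields[5], fields[6], float(fields[4])))
    return pairs


# Made-up row in the assumed layout, not copied from the dataset.
sample = [
    "main-captions\tMSRvid\t2012test\t0000\t5.000\tA man is dancing.\tA man dances.\n",
    "malformed row without tabs\n",
]
parsed = parse_sts_lines(sample)
print(parsed)
```

Splitting on tabs by hand, rather than using a CSV parser with default quoting, avoids misreading sentences that contain quotation marks.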

"datasets_cache/text.py"

Hello,
When I run the code to train the model, I encounter this error:
image

So, I would like to know where the file "datasets_cache/text.py" is located.

Paper

Hello, I could not find this paper. Could you give me a link?
