bdbc-kg-nlp / mixcse_aaai2022

Code for the AAAI 2022 paper "Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives"

MixCSE_AAAI2022

A PyTorch implementation for our paper "Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives".

You can download the paper from here.

Abstract

Unsupervised sentence representation learning is a fundamental problem in natural language processing. Recently, contrastive learning has achieved great success on this task. Existing contrastive-learning-based models usually apply random sampling to select negative examples for training. Previous work in computer vision has shown that hard negative examples help contrastive learning achieve faster convergence and better optimization for representation learning. However, the importance of hard negatives in contrastive learning for sentence representation is yet to be explored. In this study, we prove that hard negatives are essential for maintaining strong gradient signals in the training process, while randomly sampling negative examples is ineffective for sentence representation. Accordingly, we present a contrastive model, MixCSE, that extends the current state-of-the-art SimCSE by continually constructing hard negatives via mixing both positive and negative features. The superior performance of the proposed approach is demonstrated via empirical studies on Semantic Textual Similarity datasets and Transfer task datasets.
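
The idea of "mixing both positive and negative features" can be illustrated with a small sketch. This is not the repository's implementation: it assumes the mix is a plain linear interpolation followed by L2 normalization, and the names `mix_negative`, `cosine`, `l2_normalize`, and the coefficient `lam` are hypothetical, chosen only for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def mix_negative(positive, negative, lam=0.2):
    """Build a harder negative by interpolating positive and negative features.

    Assumption: linear interpolation with weight `lam` on the positive,
    then re-normalization. `lam` is a hypothetical hyperparameter.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    mixed = [lam * p + (1.0 - lam) * n for p, n in zip(positive, negative)]
    return l2_normalize(mixed)


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    positive = [0.8, 0.6]    # close to the anchor
    negative = [-0.6, 0.8]   # a randomly sampled, easy negative
    mixed = mix_negative(positive, negative, lam=0.2)
    # The mixed negative sits closer to the anchor than the original one.
    print(cosine(anchor, negative), cosine(anchor, mixed))
```

Because the mixed vector borrows part of the positive's direction, it is more similar to the anchor than the sampled negative, which is what makes it a harder example in this toy setting.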

Requirement

Python = 3.7
torch = 1.11.0
numpy = 1.17.2
transformers = 4.19.2
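
Before training, it may help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: `REQUIRED`, `check_versions`, and `installed_versions` are hypothetical names, and the check assumes each package exposes a `__version__` attribute.

```python
import importlib

# Versions copied from the requirement list above.
REQUIRED = {"torch": "1.11.0", "numpy": "1.17.2", "transformers": "4.19.2"}


def check_versions(installed, required=REQUIRED):
    """Return {package: (found, wanted)} for every missing or mismatched package."""
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        # Drop local build tags such as "+cu113" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Read `__version__` from each importable package; None if unavailable."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # broad on purpose: a broken install is reported, not raised
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", None)
    return versions


if __name__ == "__main__":
    for name, (found, wanted) in check_versions(installed_versions(REQUIRED)).items():
        print(f"{name}: found {found}, expected {wanted}")
```

An empty result means the environment matches the listed versions; any printed line points to a package worth checking when results differ from those reported.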

Train

bash run_unsup_example.sh

Evaluate

python evaluation.py \
    --model_name_or_path trained_model \
    --pooler cls \
    --task_set sts \
    --mode test

Citation

If this work is helpful, please cite as:

@article{zhang2022unsupervised,
  title={Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives},
  author={Zhang, Yanzhao and Zhang, Richong and Mensah, Samuel and Liu, Xudong and Mao, Yongyi},
  year={2022}
}

License

MIT

mixcse_aaai2022's Issues

Reproducibility

Hello, thanks for releasing the code for MixCSE.
While running it, I ran into some issues.
I got the code running by changing a few lines, but I am not sure the changes are correct.
Could you confirm whether they are fine?

1. data/sts-dev.tsv
I could not find a file named sts-dev.tsv in this repository, so when running train.py I changed the code to use SimCSE/SentEval/data/downstream/STS/STSBenchmark/sts-dev.csv, which can be downloaded from the SimCSE GitHub repository (https://github.com/princeton-nlp/SimCSE).
I also changed the default value of the eval_path argument in evaluation.py.
Is that OK? If not, could you upload the sts-dev.tsv file?

2. data_files
I get an error when loading the dataset.
So I replaced the code that loads data_files with the lines from the SimCSE GitHub repository shown below.
Is that OK? Is there a special reason to use "datasets_cache/text.py"?

from datasets import load_dataset

# Map the training file's extension to a `datasets` builder name.
data_files = {}
if data_args.train_file is not None:
    data_files["train"] = data_args.train_file
extension = data_args.train_file.split(".")[-1]
if extension == "txt":
    extension = "text"
if extension == "csv":
    # Use a tab delimiter when the file name contains "tsv".
    datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/", delimiter="\t" if "tsv" in data_args.train_file else ",")
else:
    datasets = load_dataset(extension, data_files=data_files, cache_dir="./data/")

With these changes, I obtained a test STS Avg. of 76.23 (seed=42).
I think different versions of libraries such as transformers and PyTorch could explain the difference.
I would appreciate it if you could tell me what score you got when running run_unsup_example.sh (seed=42), so that I can compare.
Thanks again for your work.
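
The file-loading change described in this issue boils down to picking a loader name and delimiter from the training file's name. A minimal sketch of that decision as a standalone function, mirroring the snippet in the issue; `resolve_loader` and the example file names are hypothetical and not part of either repository:

```python
def resolve_loader(train_file):
    """Return (builder_name, extra_kwargs) chosen from the file name.

    Mirrors the logic of the snippet quoted in the issue:
    - "txt" files use the "text" builder;
    - "csv" files get a delimiter, tab if the name contains "tsv";
    - any other extension is passed through unchanged.
    """
    extension = train_file.split(".")[-1]
    if extension == "txt":
        extension = "text"
    kwargs = {}
    if extension == "csv":
        kwargs["delimiter"] = "\t" if "tsv" in train_file else ","
    return extension, kwargs


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for path in ["data/train.txt", "data/train.csv", "data/train_tsv.csv"]:
        print(path, resolve_loader(path))
    # The result would then be passed on, for example:
    #   name, kwargs = resolve_loader(path)
    #   load_dataset(name, data_files={"train": path}, cache_dir="./data/", **kwargs)
```

Note that the tab delimiter is only selected inside the "csv" branch, so a file whose extension is literally "tsv" is passed through without a delimiter, exactly as in the quoted snippet.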

Why z4_z2_cos and not z2_z4_cos? I think the latter is more appropriate.

"datasets_cache/text.py"

Hello,
When I run the code to train the model, I encounter this error:

So, I would like to know the specific location of the file "datasets_cache/text.py"

Hello, I didn't find the loss with the hard negative

experimental result

The result I got was only 65. I don't know what was wrong.

Paper

Hello, I couldn't find this paper. Could you give me a link?
