We note that a recent work from NTU also focuses on pre-trained models for code change. Because the two works were conducted at around the same time, we did not discuss that paper in our Related Work section. We apologize for the omission and encourage readers to look at that paper as well.
pytorch=2.0.0;
torchvision=0.15.1;
torchaudio;
datasets==1.16.1;
transformers==4.21.1;
tensorboard==2.12.2;
tree-sitter==0.19.1;
nltk=3.8.1;
scipy=1.10.1;
Install the above requirements manually or execute the following script:
bash scripts/setup.sh
- Download the dataset and models:
bash scripts/download.sh
- Prepare the dataset for pre-training [optional]:
bash scripts/prepare_dataset.sh
bash scripts/pre-train.sh -g [GPU_ID]
bash scripts/finetune_msggen.sh -g [GPU_ID] -l [cpp/csharp/java/javascript/python/fira]
The released checkpoint may perform better than stated in the paper. If evaluation during fine-tuning takes too long, you can adjust the "--evaluate_sample_size" parameter, which sets the number of validation-set cases used during evaluation.
To evaluate the performance of a specific checkpoint, add the flag "-e" followed by the checkpoint path:
bash scripts/finetune_msggen.sh -g [GPU_ID] -l [cpp/csharp/java/javascript/python/fira] -e [path_to_model]
Note that if [path_to_model] is blank, this script will automatically evaluate our released checkpoint.
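To fine-tune on every option listed for "-l" in turn, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the helper name msggen_commands and the default GPU ID of 0 are assumptions made for illustration. It only prints the commands, so you can inspect them before running anything.

```shell
# Hypothetical helper (not shipped with the repository): prints one
# fine-tuning command per value accepted by the -l flag.
# The default GPU ID of 0 is an assumption.
msggen_commands() {
  gpu_id="${1:-0}"
  for lang in cpp csharp java javascript python fira; do
    echo "bash scripts/finetune_msggen.sh -g ${gpu_id} -l ${lang}"
  done
}

# Print the commands for GPU 0.
msggen_commands 0
```

Once the printed commands look right, pipe the output to bash (msggen_commands 0 | bash) to run the jobs one after another.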
bash scripts/finetune_cup.sh -g [GPU_ID]
To evaluate a specific checkpoint, as in Task 1, add the flag "-e" followed by the checkpoint path.
Additionally, we have released the output of CCT5 and the baselines, which is stored at results/CommentUpdate. Execute the following script with [path_to_result_file] set to one of these result files to evaluate its effectiveness:
bash scripts/eval_cup_res.sh --filepath [path_to_result_file]
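To score several result files in one pass, the same script can be called once per file. This is a rough sketch under assumptions: the helper name cup_eval_commands is hypothetical, and it assumes every regular file directly inside the given directory is a result file that the evaluation script accepts. As written it only prints the commands.

```shell
# Hypothetical helper (not shipped with the repository): prints one
# evaluation command per regular file in a results directory.
# Assumes each file there is a valid input for eval_cup_res.sh.
cup_eval_commands() {
  result_dir="${1:-results/CommentUpdate}"
  for result_file in "${result_dir}"/*; do
    # Skip subdirectories and the unexpanded glob of an empty directory.
    [ -f "${result_file}" ] || continue
    echo "bash scripts/eval_cup_res.sh --filepath ${result_file}"
  done
}

# Print the commands for the released results directory.
cup_eval_commands results/CommentUpdate
```

Paths containing spaces would need extra quoting before the printed commands are piped to bash.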
Fine-tune:
bash scripts/finetune_jitdp_SF.sh -g [GPU_ID]
Evaluate:
bash scripts/finetune_jitdp_SF.sh -g [GPU_ID] -e [path_to_model]
Fine-tune:
bash scripts/finetune_jitdp_SF_EF.sh -g [GPU_ID]
Evaluate:
bash scripts/finetune_jitdp_SF_EF.sh -g [GPU_ID] -e [path_to_model]
Fine-tune:
bash scripts/finetune_QE.sh -g [GPU_ID]
Evaluate:
bash scripts/finetune_QE.sh -g [GPU_ID] -e [path_to_model]
Fine-tune:
bash scripts/finetune_CodeReview.sh -g [GPU_ID]
Evaluate:
bash scripts/finetune_CodeReview.sh -g [GPU_ID] -e [path_to_model]
We reused some code from open-source repositories. We would like to extend our gratitude to the following repositories:
@inproceedings{lin2023cct5,
title={CCT5: A Code-Change-Oriented Pre-Trained Model},
author={Lin, Bo and Wang, Shangwen and Liu, Zhongxin and Liu, Yepang and Xia, Xin and Mao, Xiaoguang},
booktitle={Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
year={2023}
}