Repilot, a patch generation tool introduced in the ESEC/FSE'23 paper "Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair"

License: Apache License 2.0

Python 99.42% Dockerfile 0.58%
code-generation program-repair large-language-models code-completion program-synthesis

βš™οΈ$\mathbb{R}\mathrm{e}\mathbf{pilot}$πŸ› οΈ

Welcome to the source code repo of Repilot, a patch generation tool introduced in our ESEC/FSE'23 paper "Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair"!

Repilot Demo

Repilot leverages the synergy between a semantics-based code completion engine and an auto-regressive large language model for more efficient valid patch generation.
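
One way to picture this synergy is as a filter on the model's next-token choices: the completion engine vetoes tokens that cannot lead to valid code, and sampling continues over the remaining ones. The toy sketch below illustrates that idea only. The function `sample_with_pruning`, the `is_feasible` callback, and the example token scores are hypothetical and are not Repilot's actual API or implementation.

```python
import math
import random
from typing import Callable, Dict


def sample_with_pruning(
    logits: Dict[str, float],
    is_feasible: Callable[[str], bool],
    rng: random.Random,
) -> str:
    """Sample one token after discarding those the checker rejects."""
    # Keep only the tokens that the (hypothetical) completion engine accepts.
    feasible = {tok: score for tok, score in logits.items() if is_feasible(tok)}
    if not feasible:
        raise ValueError("no feasible token left after pruning")
    # Softmax over the surviving tokens only (shifted for numerical stability).
    max_score = max(feasible.values())
    weights = {tok: math.exp(score - max_score) for tok, score in feasible.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Hypothetical example: the model's top choice is a member that does not exist,
# so the checker rejects it and sampling falls back to the valid members.
known_members = {"length", "isEmpty"}
logits = {"size": 2.0, "length": 1.5, "isEmpty": 0.5}
token = sample_with_pruning(logits, lambda t: t in known_members, random.Random(0))
```

In this sketch a rejected token simply never gets sampled, which mirrors the accepted/rejected token logs mentioned in the quick start.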

Important

Repilot is implemented for Java patch generation as a complex hybrid system combining a Modified Eclipse JDT Language Server and Python's huggingface/transformers interface for manipulating large language models. Correctly setting up the dependencies and configurations of Repilot can be non-trivial. Therefore, we highly recommend directly using our out-of-the-box Docker image.

🚀 Quick start with Repilot's Docker image

# Pull the image and run a container.
# This may take some time...
docker run -it --name repilot universefly/repilot:latest
# Now you will get into a "virtual environment" provided by Docker
# Enter the `Repilot` directory
cd /root/Repilot
# This is important because Repilot relies on a `meta_config.json` file to work properly
cat meta_config.json
# Generate patches with the full Repilot approach using CodeT5
ACTIVE=1 python -m repilot.cli.main repair -b "Chart-9" --method pruned-mem -d chart-9-repilot -n 5
# You will see logs about the patch generation and which tokens are accepted/rejected.

# Validate the generated patches
python -m repilot.cli.main validate -d chart-9-repilot

# Print a table of the evaluation results
python -m repilot.cli.main evaluate -d chart-9-repilot
# You'll see something like this:
#                                              Repilot Evaluation Results                                              
# ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
# ┃ Tag             ┃ Average Gen Time ┃ %Compilable Patches ┃ %Plausible Patches ┃ #Plausible Fixes ┃ #Correct Fixes ┃
# ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
# │ chart-9-repilot │ 1.33s            │ 100.0%              │ 0.000%             │ 0                │ -              │
# └─────────────────┴──────────────────┴─────────────────────┴────────────────────┴──────────────────┴────────────────┘

️⭐️ Artifact️

For more comprehensive guidance on how to use Repilot and how to reproduce the results in our paper, we greatly encourage you to check out our artifact documentation.

⚠️ How to build and use Repilot from source?

Warning

Building Repilot from source is NOT recommended since there are many complex dependencies and configurations to handle. It is only for advanced users who want to extend Repilot. If you want to build from source, we also encourage you to check out our Dockerfile for more details.

Important

Environment requirements

  • Python 3.10 and Git LFS are required.
  • Java 8, 11, and 18 are all required. For convenient management of multiple Java versions, we recommend coursier.
  • (Optional) It's recommended to have an NVIDIA GPU with >6G memory for running Repilot with CodeT5 and >30G memory for Incoder-6.7B.
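
Before building anything, the first requirement can be checked with a few lines of Python. This is a minimal sketch: `check_environment` is a hypothetical helper, it assumes the Git LFS executable is named `git-lfs` and is on `PATH`, and it treats "Python 3.10" as an exact minor version, which is an assumption about the requirement.

```python
import shutil
import sys


def check_environment(version_info=sys.version_info, which=shutil.which) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Assumption: exactly Python 3.10 is expected, not merely 3.10 or newer.
    if tuple(version_info[:2]) != (3, 10):
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.10 expected, found {found}")
    # Git LFS installs a `git-lfs` executable alongside git.
    if which("git-lfs") is None:
        problems.append("git-lfs not found on PATH")
    return problems


# Example: print any problems found in the current environment.
for problem in check_environment():
    print(problem)
```

The Java installations are not checked here because their locations are configured explicitly later in meta_config.json.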

Download and build the modified Eclipse JDT Language Server

Follow the instructions in the repo to build the modified Eclipse JDT Language Server. Note you will need Java 11:

git clone https://github.com/UniverseFly/eclipse.jdt.ls
cd eclipse.jdt.ls
JAVA_HOME=/path/to/java/11 ./mvnw clean verify -DskipTests=true

Adjust the following command according to your build to dry run the language server:

java \
	-Declipse.application=org.eclipse.jdt.ls.core.id1 \
	-Dosgi.bundles.defaultStartLevel=4 \
	-Declipse.product=org.eclipse.jdt.ls.core.product \
	-Dlog.level=ALL \
	-noverify \
	-Xmx1G \
	--add-modules=ALL-SYSTEM \
	--add-opens java.base/java.util=ALL-UNNAMED \
	--add-opens java.base/java.lang=ALL-UNNAMED \
	-jar ./plugins/org.eclipse.equinox.launcher_1.5.200.v20180922-1751.jar \
	-configuration ./config_linux \
	-data /path/to/data

If everything goes well, you can move on to the next step.
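
Because the launcher jar's version suffix depends on your build, it can be easier to locate it programmatically than to hard-code it. The sketch below assumes the build output lives under `org.eclipse.jdt.ls.product/target/repository`, as in the meta_config.json template later in this README; `language_server_paths` is a hypothetical helper, not part of Repilot.

```python
from pathlib import Path


def language_server_paths(jdt_ls_repo: str) -> tuple[Path, Path]:
    """Return (launcher jar, configuration directory) for a built eclipse.jdt.ls."""
    # Assumed layout, taken from the meta_config.json template in this README.
    repository = Path(jdt_ls_repo) / "org.eclipse.jdt.ls.product" / "target" / "repository"
    # The underscore after "launcher" excludes platform-specific launcher fragments.
    jars = sorted((repository / "plugins").glob("org.eclipse.equinox.launcher_*.jar"))
    if not jars:
        raise FileNotFoundError(f"no launcher jar under {repository / 'plugins'}")
    # If several versions are present, pick the last one in sorted order.
    return jars[-1], repository / "config_linux"
```

The two returned paths correspond to the `-jar` and `-configuration` arguments of the command above.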

Download and install Repilot as a Python package including its dependencies

git clone https://github.com/UniverseFly/Repilot && cd Repilot
# Do an editable install
pip install -e .
# Consider upgrading pip if you encounter any errors, also make sure you are using Python 3.10
# This command should also install all the dependencies of Repilot

Install the Defects4J dataset

Repilot is evaluated on the Defects4J dataset. Please check out its v2.0.0 release and follow its instructions to install the dataset.

[!WARNING] If you directly download the release instead of doing a checkout you may encounter errors when running Repilot, as Repilot will dump the metadata by collecting the meta information of these projects as Git repos. If they are not Git repos, Repilot may fail.

You can check the installation by running /path/to/defects4j info -p Chart.

Prepare the runtime environment of Repilot

We need to prepare a meta_config.json file for Repilot to work properly. The file should be placed in the root directory of Repilot. Please modify the following template according to your environment and save the file in the root directory of Repilot:

{
  "d4j_home": "/home/yuxiang/Developer/defects4j",
  "d4j_checkout_root": "/home/yuxiang/Developer/d4j-checkout",
  "jdt_ls_repo": "/home/yuxiang/Developer/eclipse.jdt.ls",
  "java8_home": "/home/yuxiang/.cache/coursier/arc/https/github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u181-b13/OpenJDK8U-jdk_x64_linux_hotspot_8u181b13.tar.gz/jdk8u181-b13",
  "language_server_cmd": [
    "/home/yuxiang/.cache/coursier/arc/https/github.com/adoptium/temurin18-binaries/releases/download/jdk-18.0.2%252B9/OpenJDK18U-jdk_x64_linux_hotspot_18.0.2_9.tar.gz/jdk-18.0.2+9/bin/java",
    "-Declipse.application=org.eclipse.jdt.ls.core.id1",
    "-Dosgi.bundles.defaultStartLevel=4",
    "-Declipse.product=org.eclipse.jdt.ls.core.product",
    "-Dlog.level=ERROR",
    "-noverify",
    "-Xmx1G",
    "--add-modules=ALL-SYSTEM",
    "--add-opens",
    "java.base/java.util=ALL-UNNAMED",
    "--add-opens",
    "java.base/java.lang=ALL-UNNAMED",
    "-jar",
    "/home/yuxiang/Developer/eclipse.jdt.ls/org.eclipse.jdt.ls.product/target/repository/plugins/org.eclipse.equinox.launcher_1.6.400.v20210924-0641.jar",
    "-configuration",
    "/home/yuxiang/Developer/eclipse.jdt.ls/org.eclipse.jdt.ls.product/target/repository/config_linux"
  ],
  "seed": 0
}
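
Since every path in this template is machine-specific, a quick sanity check can catch typos before running Repilot. The sketch below is not part of Repilot: `check_meta_config` is a hypothetical helper, and it assumes that all keys shown in the template are required, which the template suggests but does not state.

```python
import json
from pathlib import Path

# Keys taken from the template above; treating all of them as required is an assumption.
REQUIRED_KEYS = (
    "d4j_home", "d4j_checkout_root", "jdt_ls_repo",
    "java8_home", "language_server_cmd", "seed",
)
# Directories that should already exist (the checkout root may be created later).
DIR_KEYS = ("d4j_home", "jdt_ls_repo", "java8_home")


def check_meta_config(config: dict) -> list[str]:
    """Return a list of problems found in a parsed meta_config.json."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    for key in DIR_KEYS:
        if key in config and not Path(config[key]).is_dir():
            problems.append(f"{key} is not a directory: {config[key]}")
    cmd = config.get("language_server_cmd")
    if cmd is not None:
        if not isinstance(cmd, list) or not cmd:
            problems.append("language_server_cmd must be a non-empty list")
        else:
            # The first element is the java executable used to launch the server.
            if not Path(cmd[0]).is_file():
                problems.append(f"java executable not found: {cmd[0]}")
            # The element following "-jar" is the launcher jar.
            if "-jar" in cmd and cmd.index("-jar") + 1 < len(cmd):
                jar = cmd[cmd.index("-jar") + 1]
                if not Path(jar).is_file():
                    problems.append(f"launcher jar not found: {jar}")
    return problems


# Usage, from the root directory of Repilot:
#   config = json.loads(Path("meta_config.json").read_text())
#   print(check_meta_config(config) or "meta_config.json looks consistent")
```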

Now let's cd back to the root directory of Repilot and run the following command to check out all the Defects4J bugs:

python -m repilot.cli.init

Do an example run

# Generate patches with the full Repilot approach using CodeT5
ACTIVE=1 python -m repilot.cli.main repair -b "Chart-9" --method pruned-mem -d chart-9-repilot -n 5
# You will see logs about the patch generation and which tokens are accepted/rejected.

# Validate the generated patches
python -m repilot.cli.main validate -d chart-9-repilot

# Print a table of the evaluation results
python -m repilot.cli.main evaluate -d chart-9-repilot

You will see a table of evaluation results if everything goes well.
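
To repeat the same three steps for another bug, the commands can be assembled in Python. This is a sketch that only recombines the flags shown above: `repilot_commands` is a hypothetical helper, and the parameter names `bug_id`, `tag`, and `n` are our own labels for the values passed to `-b`, `-d`, and `-n`, not documented option names.

```python
def repilot_commands(
    bug_id: str, tag: str, n: int = 5, method: str = "pruned-mem"
) -> list[list[str]]:
    """Build the repair, validate, and evaluate commands as argument lists."""
    base = ["python", "-m", "repilot.cli.main"]
    return [
        base + ["repair", "-b", bug_id, "--method", method, "-d", tag, "-n", str(n)],
        base + ["validate", "-d", tag],
        base + ["evaluate", "-d", tag],
    ]


# Usage, from the root directory of Repilot (ACTIVE=1 is set for the repair
# step only, as in the example run above):
#   import os, subprocess
#   for i, cmd in enumerate(repilot_commands("Chart-9", "chart-9-repilot")):
#       env = {**os.environ, "ACTIVE": "1"} if i == 0 else None
#       subprocess.run(cmd, check=True, env=env)
```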

(Optional) Unpack the pre-generated patches

The GitHub repo also contains pre-generated patches for the experiments in our paper. You can unpack them if you would like to inspect them. First, make sure you cd to the root directory of Repilot. Then run the following command:

tar -xvf ./data/large.tar.xz

Then you will see the data/large directory is populated with the pre-generated patches.

🔥🔥 Congratulations! You have successfully built and used Repilot from source! 🔥🔥
