coliee-2023-task4's Introduction

AMHR Lab 2023 COLIEE Competition Approach

Code and data for reproducing our results for the COLIEE 2023 Competition, Task 4.

Installation

All code is written for Python 3; we recommend version 3.9 or higher. Most dependencies can be installed from requirements.txt. Note that BERTScore and BLEURT have special installation steps that pip alone cannot handle; please see each project's page for instructions:

BERTScore: Evaluating Text Generation with BERT

BLEURT: a Transfer Learning-Based Metric for Natural Language Generation
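Because BERTScore and BLEURT are installed separately, it can help to confirm they are importable before running anything. The following is a minimal sketch, not part of the repository: the helper name `check_environment` is hypothetical, and it assumes the two packages expose the import names `bert_score` and `bleurt`.

```python
import importlib.util
import sys


def check_environment(modules=("bert_score", "bleurt"), min_version=(3, 9)):
    """Return a dict mapping each check name to True (ok) or False (missing).

    `modules` lists import names to look for; the defaults are assumed
    import names for BERTScore and BLEURT.
    """
    # Compare the running interpreter against the recommended minimum.
    report = {"python": sys.version_info >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

A `MISSING` entry only means the module was not found in the current environment; the installation steps themselves are on each project's page.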

Summary of Files

  1. prompt_tuning.py: All code for prompt-tuning Hugging Face and OpenAI models.
  2. dt.py: Implementation of the ensemble prompting approach.
  3. master_df.tsv: Used to train the ensemble models. For convenience, we merged all our results from the non-ensemble models into a single file for training the meta classifier.
  4. similarity.py: Implementation of the shot selection metrics.
  5. xml_processing.ipynb: Converts the raw COLIEE XML data into a pandas DataFrame for easier processing.
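To illustrate what "shot selection" means in general, the sketch below ranks candidate few-shot examples by similarity to a query and keeps the top k. It is not the code in similarity.py: the function names, the `text` field, and the Jaccard token-overlap metric are all assumptions chosen to keep the example dependency-free. Any other scoring function could be passed as `metric`.

```python
def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a stand-in metric for illustration."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_shots(query, pool, k=2, metric=jaccard):
    """Return the k examples from `pool` most similar to `query`.

    `pool` is a list of dicts with a 'text' key (a hypothetical schema).
    """
    ranked = sorted(pool, key=lambda ex: metric(query, ex["text"]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    pool = [
        {"text": "a contract of sale is effective", "label": "Y"},
        {"text": "the owner may recover possession", "label": "N"},
        {"text": "a contract of lease ends", "label": "Y"},
    ]
    for shot in select_shots("a contract of sale is void", pool):
        print(shot["label"], shot["text"])
```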

Note on Training Data

All our scripts assume the training data has already been cleaned and stored as TSV files. You must obtain the COLIEE Task 4 training data yourself from the competition organizers; we do not have permission to share it. Once you have obtained it and run it through xml_processing.ipynb, make sure to also change the file paths in the other scripts to point to where you stored the cleaned splits.
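As a rough illustration of the cleaning step, the sketch below turns XML into TSV rows using only the standard library. It is not the notebook's code. It assumes a layout of `<pair id="..." label="...">` elements, each containing `<t1>` (articles) and `<t2>` (statement) children; the column names and helper names are hypothetical, so check them against the files you receive from the organizers.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["id", "label", "t1", "t2"]  # hypothetical column names


def pairs_from_xml(xml_text):
    """Parse <pair> elements into a list of dicts (assumed XML layout)."""
    root = ET.fromstring(xml_text)
    rows = []
    for pair in root.iter("pair"):
        rows.append({
            "id": pair.get("id", ""),
            "label": pair.get("label", ""),
            # Collapse all whitespace so no tabs or newlines leak into the TSV.
            "t1": " ".join((pair.findtext("t1") or "").split()),
            "t2": " ".join((pair.findtext("t2") or "").split()),
        })
    return rows


def write_tsv(rows, handle):
    """Write rows to an open text handle as tab-separated values."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    sample = ('<dataset><pair id="X-1" label="Y">'
              '<t1>Article 1 text</t1><t2>Some statement.</t2>'
              '</pair></dataset>')
    buffer = io.StringIO()
    write_tsv(pairs_from_xml(sample), buffer)
    print(buffer.getvalue())
```

A file written this way can be loaded with `pandas.read_csv(path, sep="\t")`.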

Citation

Information about how to cite our work will be released after the conference.
