slang-detection's Introduction

Model Enhancement with Data Augmentation for Slang Detection

Computational Semantics Project, ETH Zürich

Environment Setup and Activation

On Linux, use the commands below to create the conda environment and then activate it.

conda env create -f environment.yml
conda activate csnlp

Use pipeline.py to run the different parts of the project.

Data extraction

Use data_processing.py for data extraction.

Data Augmentation

Method: GPT-2 with top-k/top-p sampling

Metrics: BLEU, perplexity, frequency

To generate the final augmentation examples, run:

python pipeline.py --pipeline final_augmentation
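The top-k/top-p step named above can be illustrated with a small, dependency-free sketch. It is not the code in pipeline.py: the function names `top_k_top_p_filter` and `sample_token` and the toy distribution are assumptions made for illustration. It shows how a next-token distribution is first cut to the k most likely tokens, then to the smallest prefix whose cumulative probability reaches p, and finally renormalised before sampling.

```python
import random


def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Restrict a probability distribution with top-k and top-p (nucleus) filtering.

    probs: list of token probabilities that sum to 1.
    top_k: keep at most this many tokens (0 disables the top-k cut).
    top_p: keep the smallest set of most likely tokens whose cumulative
           probability reaches top_p (1.0 disables the nucleus cut).
    Returns a renormalised distribution of the same length.
    """
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the token that crosses the threshold is still kept

    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


def sample_token(probs, rng):
    """Draw one token index according to the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy next-token distribution over a four-token vocabulary (hypothetical values).
toy_probs = [0.5, 0.3, 0.15, 0.05]
filtered_probs = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.7)
token = sample_token(filtered_probs, random.Random(0))
```

With these toy values only the two most likely tokens survive the filter, so the sampled index is always 0 or 1. In an actual GPT-2 setup the same filtering is applied to the model's next-token distribution at every generation step.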

Masked Language Modelling

Run rs_mlm.py to random-search the MLM hyperparameters.

Run run_mlm.py to train and store the final masked language modelling adapter.
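As a rough picture of what a random search over hyperparameters does, here is a minimal sketch. It does not reproduce rs_mlm.py: the `random_search` helper, the search space, and the toy objective are all hypothetical, and a real run would train a model and return a validation score instead of the stand-in formula.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials random configurations and return the best one.

    objective: function mapping a parameter dict to a score (higher is better).
    space: dict mapping each parameter name to a list of candidate values.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw each parameter independently and uniformly from its candidates.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the real scripts may tune different parameters.
space = {
    "learning_rate": [1e-5, 3e-5, 1e-4],
    "batch_size": [16, 32],
    "epochs": [3, 5, 10],
}


def toy_objective(params):
    # Stand-in for a validation score, used only to make the sketch runnable.
    return -abs(params["learning_rate"] - 3e-5) - 0.001 * abs(params["batch_size"] - 32)


best_params, best_score = random_search(toy_objective, space, n_trials=20)
```

Random search trades exhaustiveness for cost: each trial is independent, so the number of trials can be set by the available compute budget rather than by the size of the grid.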

Classification

Baseline Model

Run rs_cls_baseline.py to random-search the baseline hyperparameters.

Run cls_baseline_base.py and cls_baseline_mini.py to train the final baseline models.

Enhanced Model

Run rs_cls_enhanced.py to random-search the enhanced model hyperparameters.

Run cls_enhanced_base.py and cls_enhanced_mini.py to train the final enhanced models.

Contributors

feichilu, junlingwang0512