BERT-fine-tuning-analysis

The codebase for the paper: A Closer Look at How Fine-tuning Changes BERT.

Installing

This codebase is derived from DirectProbe and follows the same installation instructions as DirectProbeCode.

Getting Started

Downloading datasets and running examples

  1. Download the pre-packed data from here and unzip it. The data format is the same as in DirectProbeCode.

  2. Assuming all the pre-packed data is placed in the directory data, we can run an experiment using the configuration in config.ini:

        python main.py
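
     Before running, you may want to see what the shipped config.ini contains. The following is a minimal sketch using Python's standard configparser; it assumes config.ini is a plain INI file and makes no assumptions about its section or option names.

        # Print every section and option in the DirectProbe-style config.ini,
        # so you can see what to adjust (e.g., the path to the unpacked data).
        import configparser

        config = configparser.ConfigParser()
        config.read("config.ini")

        for section in config.sections():
            print(f"[{section}]")
            for key, value in config[section].items():
                print(f"  {key} = {value}")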
    

Results

After probing, you will find the results in the directory results/SS/ (we use the supersense role task as the running example). This directory contains six files; a short loading sketch follows the list:

  • clusters.txt: The clustering results. Each line contains a cluster number for the corresponding training example.

  • inside_max.txt: The maximum pairwise distance inside each cluster. Each line represents one cluster.

  • inside_mean.txt: The mean pairwise distance inside each cluster. Each line represents one cluster.

  • log.txt: The probing log file.

  • outside_min.txt: The minimum distance from each cluster to any other cluster. Each line represents one cluster.

  • vec.txt: Pairwise distances between clusters. Each line represents a pair of clusters and the distance between them.
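
Each of these files is plain text and can be loaded with a few lines of Python. The sketch below follows the descriptions above; the exact column layout of vec.txt (cluster id, cluster id, distance) is an assumption and may need adjusting.

    # Load the probing outputs from results/SS/ as described above.
    from pathlib import Path

    result_dir = Path("results/SS")

    # clusters.txt: one cluster id per line, one line per training example.
    clusters = [
        int(line) for line in (result_dir / "clusters.txt").read_text().splitlines() if line.strip()
    ]

    # Per-cluster distance files: one line per cluster; the value is taken from
    # the last whitespace-separated field on each line.
    def read_per_cluster(name):
        lines = (result_dir / name).read_text().splitlines()
        return [float(line.split()[-1]) for line in lines if line.strip()]

    inside_max = read_per_cluster("inside_max.txt")
    inside_mean = read_per_cluster("inside_mean.txt")
    outside_min = read_per_cluster("outside_min.txt")

    # vec.txt: one cluster pair and its distance per line.
    # The assumed order is (cluster_i, cluster_j, distance).
    pair_distances = []
    for line in (result_dir / "vec.txt").read_text().splitlines():
        fields = line.split()
        if len(fields) >= 3:
            pair_distances.append((fields[0], fields[1], float(fields[-1])))

    print(f"{len(set(clusters))} clusters over {len(clusters)} training examples")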

Citations

@inproceedings{zhou-srikumar-2022-closer,
    title = "A Closer Look at How Fine-tuning Changes {BERT}",
    author = "Zhou, Yichu  and
      Srikumar, Vivek",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.75",
    doi = "10.18653/v1/2022.acl-long.75",
    pages = "1046--1061",
    abstract = "Given the prevalence of pre-trained contextualized representations in today{'}s NLP, there have been many efforts to understand what information they contain, and why they seem to be universally successful. The most common approach to use these representations involves fine-tuning them for an end task. Yet, how fine-tuning changes the underlying embedding space is less studied. In this work, we study the English BERT family and use two probing techniques to analyze how fine-tuning changes the space. We hypothesize that fine-tuning affects classification performance by increasing the distances between examples associated with different labels. We confirm this hypothesis with carefully designed experiments on five different NLP tasks. Via these experiments, we also discover an exception to the prevailing wisdom that {``}fine-tuning always improves performance{''}. Finally, by comparing the representations before and after fine-tuning, we discover that fine-tuning does not introduce arbitrary changes to representations; instead, it adjusts the representations to downstream tasks while largely preserving the original spatial structure of the data points.",
}

Issues

What do you mean by "We apply DIRECTPROBE on the training and test set separately"?

Hi,

In your paper, A Closer Look at How Fine-tuning Changes BERT, Section 4.1 says that "We apply DIRECTPROBE on the training and test set separately".
DirectProbe needs both a training set and a test set. Does this mean that you split the training set into train/test and also split the test set into train/test?
Or do you simply use the training set as its own test set?

Thanks. :D
