bdzyubak commented on September 2, 2024

DistilBERT was fine-tuned to 0.8 training accuracy and 0.68 validation accuracy. Validation accuracy peaked after one epoch, after which validation performance deteriorated severely as the model overfit the training data and lost the general-purpose representations learned in pre-training. Because of checkpointing, the best model (from the first epoch) was saved, so the later epochs do not affect the result.

[Figure: training_metrics_version_7]
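
The checkpointing behaviour described above can be sketched in a few lines. This is a minimal illustration, not the training code used for the experiment: the `BestCheckpoint` class is hypothetical, the dictionary stands in for a real `model.state_dict()`, and the accuracies after the first epoch are made-up values chosen only to mimic a peak followed by a decline.

```python
import copy


class BestCheckpoint:
    """Keeps a copy of the state from the epoch with the highest monitored metric."""

    def __init__(self):
        self.best_metric = float("-inf")
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, metric, state):
        """Store `state` if `metric` beats the best value seen so far."""
        if metric > self.best_metric:
            self.best_metric = metric
            self.best_epoch = epoch
            self.best_state = copy.deepcopy(state)
            return True
        return False


# Illustrative values only: 0.68 is the reported peak, the rest are invented.
val_acc_by_epoch = [0.68, 0.61, 0.55]

tracker = BestCheckpoint()
for epoch, val_acc in enumerate(val_acc_by_epoch, start=1):
    state = {"epoch": epoch}  # stand-in for model.state_dict()
    tracker.update(epoch, val_acc, state)

print(tracker.best_epoch, tracker.best_metric)
```

Because later epochs never beat the first one, the stored state is never overwritten, which is why the overfit epochs can be ignored.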

Validation performance is relatively poor and well below training performance. A larger model might fit the training data better than 80%, which would address the underfitting. On the other hand, I expect even training accuracy to be limited by the labeling issue described below. The overfitting may be addressed by freezing most of the network layers and training only the classification head.
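
A rough sketch of the freezing idea in PyTorch follows. The `TinyClassifier` model is a small stand-in rather than DistilBERT, and the attribute name `classifier`, the layer sizes, and the number of labels are all assumptions made for illustration. The only part that should carry over to a real model is the pattern of setting `requires_grad` by parameter name and passing only the trainable parameters to the optimizer.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Stand-in for a pre-trained encoder followed by a classification head."""

    def __init__(self, vocab_size=100, hidden=16, num_labels=5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Embedding(vocab_size, hidden),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, token_ids):
        # Mean-pool token representations, then classify.
        pooled = self.encoder(token_ids).mean(dim=1)
        return self.classifier(pooled)


def freeze_all_but_head(model, head_prefix="classifier"):
    """Disable gradients everywhere except parameters under `head_prefix`."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_prefix)
    return [p for p in model.parameters() if p.requires_grad]


model = TinyClassifier()
trainable = freeze_all_but_head(model)
# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(trainable, lr=1e-3)
```

With a real pre-trained model the head may live under a different attribute name, so it is worth printing `model.named_parameters()` first and adjusting `head_prefix` accordingly.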

The labeled data has been augmented by heavily resampling each review into smaller chunks, down to a single letter. Each chunk inherits the label of the original review, so the target sentiment for "A", "A series", "occasionally amuses", and "none of which amounts to much of a story" is the label of the full review they were cut from. Without more intelligent splitting, this may cap the network's ability to learn sentiment, since the same short chunk, such as "A" or "A series", appears in many reviews and therefore carries inconsistent labels.
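
One way to quantify this cap is to group identical chunks and check how many carry more than one label. The sketch below uses a handful of made-up samples (the phrases come from the examples above, but the labels and counts are invented for illustration) and computes the best accuracy any model could reach on them if it must give the same prediction for the same text.

```python
from collections import Counter, defaultdict


def label_counts_by_text(samples):
    """Map each distinct text to a Counter of the labels it appears with."""
    by_text = defaultdict(Counter)
    for text, label in samples:
        by_text[text][label] += 1
    return by_text


def accuracy_ceiling(by_text):
    """Best achievable accuracy when each text always gets its majority label."""
    total = sum(sum(counts.values()) for counts in by_text.values())
    best = sum(max(counts.values()) for counts in by_text.values())
    return best / total


# Hypothetical (text, label) pairs; labels are invented for illustration.
samples = [
    ("A", 1), ("A", 3), ("A", 0), ("A", 3),
    ("A series", 1), ("A series", 3),
    ("occasionally amuses", 3),
    ("none of which amounts to much of a story", 1),
]

by_text = label_counts_by_text(samples)
ambiguous = {t: dict(c) for t, c in by_text.items() if len(c) > 1}
ceiling = accuracy_ceiling(by_text)

print(sorted(ambiguous))  # chunks with conflicting labels
print(ceiling)            # 0.625 for the toy samples above
```

Running the same two functions over the real training set would show how much of the gap to 100% training accuracy is explained by conflicting labels alone.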

Next steps:

  1. The IMDB dataset is also interesting for sentiment analysis. Potentially implement it as a separate experiment, then cross-validate by training on one or both datasets.
  2. Implement the other common networks and compare performance. For those that ship with out-of-the-box sentiment analysis, evaluate performance without fine-tuning.
  3. Compare full fine-tuning against training with frozen layers, where only the sentiment head is allowed to train.

from torch-control.
