bertpunc's People

Contributors

nickreinerink, nkrnrnk

bertpunc's Issues

Error in train.py

Hi, I think there is a bug in train.py: right at the start of the main function, the variable punctuation_enc is defined twice, as shown below. The second definition overwrites the first, so it needs to be commented out in order to train the model with the LREC dataset.

    punctuation_enc = {
        'O': 0,
        'COMMA': 1,
        'PERIOD': 2,
        'QUESTION': 3
    }

    punctuation_enc = {
        'O': 0,
        'PERIOD': 1,
    }
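A minimal sketch of one way to avoid the silent overwrite, assuming the four-class encoding is the one wanted for LREC (as the report above implies). The constant names and the `select_punctuation_enc` helper are hypothetical and are not part of train.py.

```python
# Both encodings are copied from the issue above; the names are hypothetical.
FOUR_CLASS_ENC = {'O': 0, 'COMMA': 1, 'PERIOD': 2, 'QUESTION': 3}
TWO_CLASS_ENC = {'O': 0, 'PERIOD': 1}


def select_punctuation_enc(use_four_classes=True):
    """Return exactly one encoding instead of assigning the variable twice."""
    return FOUR_CLASS_ENC if use_four_classes else TWO_CLASS_ENC


# Demonstrates the reported behaviour: the second assignment rebinds the name,
# so COMMA and QUESTION are no longer known labels.
punctuation_enc = FOUR_CLASS_ENC
punctuation_enc = TWO_CLASS_ENC
print('COMMA' in punctuation_enc)

# Picking the encoding explicitly keeps all four labels.
punctuation_enc = select_punctuation_enc(use_four_classes=True)
print(sorted(punctuation_enc))
```

With an explicit switch like this, changing datasets means changing one argument, not commenting code in and out.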

How does the function `insert_target()` in data.py work?

  • Question:

I found that when insert_target() in data.py is used, the input data is split into many sequences whose words overlap heavily with each other.

Why is the data processed this way? It seems to create a lot of repeated data.

datasets

Could you please upload an example of the datasets you load in train.py, lines 190-192?

Issue running in Colab in April 2020

Just FYI for future users: I got this code running in April 2020 in a Colab notebook by reverting to earlier versions of some libraries. I'm not sure which versions were originally used, so I guessed based on the original code being from around March 2019.

    !pip install -q torch==1.0.0 torchvision==0.2.0
    !pip install pytorch_pretrained_bert==0.5.0

Warning: I don't know whether this actually reproduces the original experiment, since I don't have an exact match for the dataset.
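A minimal sketch for checking, from inside the notebook, whether the installed versions match the pins above. It assumes Python 3.8 or later for `importlib.metadata`. The `find_mismatches` helper is a hypothetical name, and the exact distribution name under which pytorch_pretrained_bert is registered may differ between environments.

```python
from importlib import metadata

# Version pins taken from the pip commands in this issue.
PINNED = {
    "torch": "1.0.0",
    "torchvision": "0.2.0",
    "pytorch_pretrained_bert": "0.5.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_mismatches(PINNED).items():
    print(f"{name}: wanted {wanted}, found {installed}")
```

An empty result means all three pins are satisfied. It says nothing about whether these are the versions the original author used.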

Pre-trained weights

Hi,
This model is fine-tuned on a pretrained reimplementation of BERT. Could anyone please share the model weights produced by this fine-tuning experiment?

Thanks in advance!

Missing dataset

Would it be possible to include one of the datasets? It's difficult to tell the data format without an included dataset.

Results

I tried to reproduce the results using your Jupyter notebook, but for some reason I only got:

COMMA     PERIOD    QUESTION  OVERALL
0.062041  0.063562  0.001647  0.042417
0.307018  0.231150  0.171429  0.236532
0.103223  0.099707  0.003264  0.068731

(for test2011asr)

Could you please tell me why the results are so bad?
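The row labels did not survive in the table above, but a quick arithmetic check is consistent with the rows being precision and recall (in some order) followed by an F-score, and with OVERALL being the plain average of the three class columns. The sketch below only verifies that arithmetic on the posted numbers. The interpretation is an assumption, and which of the first two rows is precision cannot be told from the numbers alone.

```python
# Values copied from the table in this issue: (row 1, row 2, row 3) per column.
table = {
    "COMMA": (0.062041, 0.307018, 0.103223),
    "PERIOD": (0.063562, 0.231150, 0.099707),
    "QUESTION": (0.001647, 0.171429, 0.003264),
    "OVERALL": (0.042417, 0.236532, 0.068731),
}
CLASSES = ["COMMA", "PERIOD", "QUESTION"]


def harmonic_mean(a, b):
    """Harmonic mean of two rates, the formula used for an F1 score."""
    return 2 * a * b / (a + b)


def close(x, y, tol=1e-5):
    return abs(x - y) <= tol


# Row 3 should equal the harmonic mean of rows 1 and 2 for each class.
f_checks = {c: close(harmonic_mean(table[c][0], table[c][1]), table[c][2])
            for c in CLASSES}

# Each OVERALL entry should equal the mean of the three class entries.
avg_checks = [close(sum(table[c][row] for c in CLASSES) / 3, table["OVERALL"][row])
              for row in range(3)]

print(f_checks)
print(avg_checks)
```

If that reading is right, one of the first two rows is far lower than the other for every class, which may help narrow down where the reproduction diverges.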
