
Comments (8)

mrdrozdov commented on August 21, 2024

Can I train the model on WSJ instead of NLI?

You can definitely do this (it is what I typically do). You can also use the other data_types described in the README:

  • txt - One space-delimited sentence per line.
  • txt_id - One space-delimited sentence per line, and the first token is the example id.
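
As a rough illustration of the two formats listed above, the sketch below writes a couple of tokenized sentences in `txt` and `txt_id` form. The helper names and the example ids are hypothetical and are not part of the diora codebase.

```python
def to_txt_lines(sentences):
    """One space-delimited sentence per line (the `txt` format)."""
    return [" ".join(tokens) for tokens in sentences]


def to_txt_id_lines(sentences, ids):
    """Same as `txt`, but the first token on each line is the example id."""
    return [" ".join([str(i)] + list(tokens)) for i, tokens in zip(ids, sentences)]


sentences = [["The", "cat", "sat", "."], ["Stocks", "fell", "sharply", "."]]
txt_lines = to_txt_lines(sentences)
txt_id_lines = to_txt_id_lines(sentences, ids=["ex0", "ex1"])  # hypothetical ids

print("\n".join(txt_lines))
print("\n".join(txt_id_lines))
```

Writing either list to a file, one entry per line, should give input that matches the format descriptions above.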

how does this program know what is the gold parse tree for the corresponding sentence?

We don't provide any evaluation code at this time, only parsing code. If needed, the repo at https://github.com/harvardnlp/urnng#evaluation describes how to perform evaluation using EVALB.
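
For intuition about what an EVALB-style score measures, here is a minimal sketch of unlabeled bracketing F1 over span sets. It is not EVALB and will not reproduce its exact numbers, since EVALB applies its own parameter file and filtering rules. The function names and the nested-list tree encoding are assumptions made for illustration.

```python
def spans(tree, start=0):
    """Return (set of (start, end) spans, length) for a nested-list tree.

    Leaves are strings; internal nodes are lists of subtrees.
    The whole-sentence span is included; conventions differ on whether
    such trivial spans should be counted.
    """
    if isinstance(tree, str):
        return set(), 1
    result, pos = set(), start
    for child in tree:
        child_spans, length = spans(child, pos)
        result |= child_spans
        pos += length
    result.add((start, pos))
    return result, pos - start


def bracket_f1(pred_tree, gold_tree):
    """Unlabeled F1 between the span sets of a predicted and a gold tree."""
    pred, _ = spans(pred_tree)
    gold, _ = spans(gold_tree)
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


pred = [["the", "cat"], ["sat", "down"]]
gold = ["the", ["cat", ["sat", "down"]]]
score = bracket_f1(pred, gold)
print(round(score, 3))
```

This is a per-sentence score. A corpus-level F1 would instead sum the overlap, predicted, and gold span counts over all sentences before computing precision and recall.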

from diora.

mrdrozdov commented on August 21, 2024

Our WSJ results were achieved by training on NLI. The training hyperparameters are described here: https://github.com/iesl/diora#training

python -m torch.distributed.launch --nproc_per_node=4 diora/scripts/train.py \
    --arch treelstm \
    --batch_size 128 \
    --data_type nli \
    --elmo_cache_dir ~/data/elmo \
    --emb elmo \
    --hidden_dim 400 \
    --k_neg 100 \
    --log_every_batch 100 \
    --lr 2e-3 \
    --normalize unit \
    --reconstruct_mode margin \
    --save_after 1000 \
    --train_filter_length 20 \
    --train_path ~/data/allnli.jsonl \
    --cuda --multigpu

After training, you can use a saved model checkpoint to parse WSJ. This is also covered in the README (https://github.com/iesl/diora#parsing); if you have a more specific question, I can try to answer it.

Correct! You will need to prepare WSJ in space-delimited format. Unfortunately, the WSJ corpus is not open source.
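
One way to get space-delimited sentences from a treebank is to read off the leaves of each bracketed tree. The sketch below assumes Penn Treebank style bracketing with one tree per string and drops `-NONE-` trace tokens. The function name and the sample tree are illustrative, and this is not a script from the diora repo.

```python
import re

# Matches a preterminal such as "(NN cat)": a tag followed by a single word.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def tree_to_sentence(tree_str):
    """Return the space-delimited tokens of a bracketed tree, skipping traces."""
    tokens = [word for tag, word in LEAF.findall(tree_str) if tag != "-NONE-"]
    return " ".join(tokens)


tree = "(S (NP (DT The) (NN cat)) (VP (VBD sat) (NP (-NONE- *T*-1))) (. .))"
sentence = tree_to_sentence(tree)
print(sentence)
```

Punctuation is kept in the output here, which fits the advice elsewhere in this thread to parse with punctuation and remove it afterwards.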


rm-rf0 commented on August 21, 2024

Can I train the model on WSJ instead of NLI?

To do so, I just need to prepare the WSJ training set in NLI format. Is that correct?


rm-rf0 commented on August 21, 2024

Also, if the provided WSJ data is in space-delimited format, how does this program know what is the gold parse tree for the corresponding sentence?


rm-rf0 commented on August 21, 2024

Thanks a lot! Sorry for the unclear questions!


bobwan1995 commented on August 21, 2024

Hi, can you reproduce the WSJ result reported in the paper (max F1 56.76)? When I directly use the checkpoint provided in this repo to evaluate on WSJ, I only obtain a corpus F1 of around 43 (I use the full PTB test set, keeping sentences longer than 1 token and removing punctuation). Do I need to fine-tune the model on the WSJ dataset, or am I missing some details? Thank you so much!


mrdrozdov commented on August 21, 2024

You should parse, and remove punctuation after.
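
A sketch of the order of operations described above: the parser sees the full sentence, punctuation included, and punctuation leaves are pruned from the predicted tree afterwards. The nested-list tree encoding, the punctuation set, and the function name are assumptions for illustration; the actual output format of the parsing script may differ.

```python
# Assumed punctuation tokens; adjust to match the tokenization actually used.
PUNCT = {".", ",", ":", ";", "!", "?", "``", "''", "--", "-LRB-", "-RRB-"}


def remove_punct(tree):
    """Prune punctuation leaves; collapse nodes left with one or zero children."""
    if isinstance(tree, str):
        return None if tree in PUNCT else tree
    children = [remove_punct(child) for child in tree]
    children = [c for c in children if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return children


# Hypothetical parser output for "The cat sat down ."
predicted = [["The", "cat"], [["sat", "down"], "."]]
cleaned = remove_punct(predicted)
print(cleaned)
```

Collapsing single-child nodes keeps the cleaned tree free of unary brackets, so the remaining spans can be compared against a gold tree that had its punctuation removed the same way.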

If you are willing to wait a little bit, then I can add the output trees to the repo.


bobwan1995 commented on August 21, 2024

Thank you so much for your kind reply!

You should parse, and remove punctuation after.

Do you mean I should keep the punctuation at inference time and then remove it in postprocessing?

I would be grateful if you could add the output trees to the repo. I'm looking forward to the results!

