
amrcoref's People

Contributors

sean-blank

Forkers

trellixvulnteam

amrcoref's Issues

Attribute error: 'penman' has no attribute 'AMRCodec'

The above error showed up when I attempted to run 'prepare_data.py'.
I checked my version of penman, and it is the same version that 'pipreqs' reports for the codebase (I ran pipreqs on your code to see which version it uses).
Please help.
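
One way to narrow this down is to check what the installed penman actually exposes, rather than relying on the version string alone. As far as I know, pipreqs records the version it finds in the local environment, so it may not reflect the version the code was written against. The helper below is a minimal sketch; `inspect_attr` is a hypothetical name, not part of penman or of this repository. Another issue on this page reports running prepare_data.py with penman 0.6.0, which may be a useful comparison point.

```python
import importlib
import importlib.util
from importlib import metadata


def inspect_attr(module_name, attr_name, dist_name=None):
    """Report whether a module is installed, its version, and whether it has attr_name.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    report = {"module": module_name, "installed": False,
              "version": None, "has_attr": False}
    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        return report
    report["installed"] = True
    try:
        report["version"] = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        pass  # e.g. standard-library modules have no distribution metadata
    module = importlib.import_module(module_name)
    report["has_attr"] = hasattr(module, attr_name)
    return report


if __name__ == "__main__":
    # Shows whether the penman in this environment defines AMRCodec.
    print(inspect_attr("penman", "AMRCodec"))
```

If `has_attr` comes back False, the installed release does not define `AMRCodec`, and trying a different penman release in a fresh virtual environment is probably the next step.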

Unclear data format

When running python train.py, it says that:
FileNotFoundError: [Errno 2] No such file or directory: './data/corpora_base/evl'

The dataloader needs data in JSON format, but it is not clear how to generate it from the raw LDC2020T02 data. What should the JSON data look like? And what needs to be done to get the code running, given the LDC2020T02 data in txt format?
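
While the expected JSON layout is undocumented, a quick pre-flight check can at least show which input paths are missing before train.py fails partway through. This is a minimal sketch: `missing_paths` is a hypothetical helper, and only the path taken from the error message above is known; any other paths would have to come from your own configuration.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the entries of paths that do not exist on disk, in the same order."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # Only this path comes from the error message; add the others your config uses.
    expected = ["./data/corpora_base/evl"]
    for p in missing_paths(expected):
        print(f"missing: {p}")
```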

Assertion failed on "assert len(amr.nodes) == len(concept) == len(amr.node_values)"

I used penman 0.6.0 to run prepare_data.py.

Then when I ran preprocess.py, I got:

Traceback (most recent call last):
  File "preprocess.py", line 354, in <module>
    item = pre_to_json(data_per_doc, links_per_doc, file_name)
  File "preprocess.py", line 307, in pre_to_json
    data, cluster = get_clusters_info(links, data)
  File "preprocess.py", line 260, in get_clusters_info
    pdb.set_trace()
  File "preprocess.py", line 240, in mapping_edges
    assert len(amr.nodes) == len(concept) == len(amr.node_values)
AssertionError

Tokenizer problem while training using train.py

I am training on the preprocessed data and am facing the following issue. I understand that the cause is that the model cannot be found. Is there any way to solve this?

PU available: True CuDNN: True
Using GPU To Train... GPU ID: 1
Log file path: ./ckpt/coref.amr.log
Model name './data/bert-base-cased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed './data/bert-base-cased' was a path or url but couldn't find any file associated to this path or url.
load train data
Traceback (most recent call last):
  File "train.py", line 165, in <module>
    train(args)
  File "train.py", line 34, in train
    train_data, dev_data, test_data, vocabs = make_data(args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 504, in make_data
    train_data = load_json(args.train_data, args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 156, in load_json
    token_bert_ids = get_bert_ids(tokens, args, tokenizer)
  File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 93, in get_bert_ids
    for char in tokenizer.tokenize(word):
AttributeError: 'NoneType' object has no attribute 'tokenize'
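
The log suggests the tokenizer loader found nothing at './data/bert-base-cased' and returned None, which then fails at `tokenizer.tokenize`. A possible workaround, sketched below, is to check that the local directory holds a vocabulary file and otherwise fall back to the model name 'bert-base-cased', which the log lists as a known name. `resolve_bert_source` is a hypothetical helper, and the sketch assumes the usual BERT vocabulary filename, vocab.txt.

```python
from pathlib import Path


def resolve_bert_source(local_dir, fallback_name="bert-base-cased"):
    """Return local_dir if it contains vocab.txt, otherwise the fallback model name.

    Hypothetical helper: it only decides which string to hand to the tokenizer
    loader; it does not load or download anything itself.
    """
    if (Path(local_dir) / "vocab.txt").is_file():
        return str(local_dir)
    return fallback_name


if __name__ == "__main__":
    source = resolve_bert_source("./data/bert-base-cased")
    # Pass `source` to the tokenizer's from_pretrained call used in train.py.
    print(f"load tokenizer from: {source}")
```

Whichever source is used, checking that the returned tokenizer is not None right after loading would surface the problem before the dataloader runs.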

Reproducing reported numbers

Hello,

I ran the training and evaluation scripts once and got an F1 of around 54, which is much lower than the 62 reported in the paper. Was that number obtained by choosing the best of multiple runs? If so, how many runs does it take?

Data preparation/preprocessing

Hi, I'm trying to prepare data (we bought an AMR3.0 licence).
But I failed to run preprocess.py. It seems that data/align_unsplit does not exist. Which files must I copy there? In your data/ there are three msamr_dfb_*.align, but I do not have these files in my AMR3.0 data. Are these renamed files or produced by a script on the AMR3 data?
