sean-blank / amrcoref

Python 100.00%

amrcoref's Issues

Attribute error: 'penman' has no attribute 'AMRCodec'

The error above appeared when I tried to run 'prepare_data.py'.
I checked my penman version, and it is the same version that 'pipreqs' reports for the codebase. (I ran pipreqs to see which version your code uses, and it matches the one I have installed.)
Please help.
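A quick way to narrow this down is to check, in the same environment that runs 'prepare_data.py', which penman version is actually imported and whether it exposes the attribute. The sketch below is only a diagnostic; the helper name `describe_module_attr` is made up for illustration and is not part of the repository or of penman.

```python
import importlib
from importlib import metadata


def describe_module_attr(module_name: str, attr: str) -> str:
    """Report whether an importable module exposes a given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name} is not installed"
    try:
        # Assumes the distribution name matches the module name.
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        version = "unknown"
    status = "has" if hasattr(module, attr) else "does not have"
    return f"{module_name} {version} {status} attribute {attr}"


if __name__ == "__main__":
    # The module and attribute named in the error message.
    print(describe_module_attr("penman", "AMRCodec"))
```

If the reported version differs from what pipreqs listed, the script is probably running under a different interpreter or virtual environment than the one that was inspected.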

Unclear data format

Running python train.py fails with:
FileNotFoundError: [Errno 2] No such file or directory: './data/corpora_base/evl'

The dataloader expects data in JSON format, but it is not clear how to generate or preprocess that data from the raw LDC2020T02 release. What should the JSON data look like? And what needs to be done to get the code running, given that the LDC2020T02 data is in txt format?

Assertion failed on "assert len(amr.nodes) == len(concept) == len(amr.node_values)"

I used penman 0.6.0 to run prepare_data.py.

Then when I ran preprocess.py, I got:

Traceback (most recent call last):
File "preprocess.py", line 354, in <module>
item = pre_to_json(data_per_doc, links_per_doc, file_name)
File "preprocess.py", line 307, in pre_to_json
data, cluster = get_clusters_info(links, data)
File "preprocess.py", line 260, in get_clusters_info
pdb.set_trace()
File "preprocess.py", line 240, in mapping_edges
assert len(amr.nodes) == len(concept) == len(amr.node_values)
AssertionError
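The bare assertion does not say which of the three lengths differs. As a debugging aid, a check like the following could be used in its place to print the actual sizes. This is a sketch only: `check_equal_lengths` is a hypothetical helper, not something defined in preprocess.py.

```python
def check_equal_lengths(**sequences):
    """Return each sequence's length, raising if they are not all equal."""
    lengths = {name: len(seq) for name, seq in sequences.items()}
    if len(set(lengths.values())) > 1:
        # Include every length so the mismatching one is visible.
        raise ValueError(f"Length mismatch: {lengths}")
    return lengths
```

In mapping_edges it might be called as `check_equal_lengths(nodes=amr.nodes, concept=concept, node_values=amr.node_values)`, so the error names the sequence that is shorter or longer than the others.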

Tokenizer problem while training using train.py

I am training on the preprocessed data and get the error below. As far as I can tell, the BERT model cannot be found at the given path. Is there a way to solve this?

GPU available: True CuDNN: True
Using GPU To Train... GPU ID: 1
Log file path: ./ckpt/coref.amr.log
Model name './data/bert-base-cased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed './data/bert-base-cased' was a path or url but couldn't find any file associated to this path or url.
load train data
Traceback (most recent call last):
File "train.py", line 165, in <module>
train(args)
File "train.py", line 34, in train
train_data, dev_data, test_data, vocabs = make_data(args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 504, in make_data
train_data = load_json(args.train_data, args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 156, in load_json
token_bert_ids = get_bert_ids(tokens, args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 93, in get_bert_ids
for char in tokenizer.tokenize(word):
AttributeError: 'NoneType' object has no attribute 'tokenize'
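The log suggests the tokenizer was never loaded from './data/bert-base-cased', so `tokenizer` is None by the time get_bert_ids runs. A small guard can surface this earlier with a clearer message. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the list of expected files (vocab.txt and config.json) is an assumption about a typical local BERT checkpoint directory, not something stated in the repository.

```python
from pathlib import Path

# Assumption: files usually present in a local BERT checkpoint directory.
EXPECTED_FILES = ("vocab.txt", "config.json")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """List the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


def require_tokenizer(tokenizer, model_dir):
    """Fail fast with a readable error if no tokenizer was loaded."""
    if tokenizer is None:
        missing = missing_model_files(model_dir)
        raise FileNotFoundError(
            f"No tokenizer loaded from {model_dir!r}; missing files: {missing}"
        )
    return tokenizer
```

Calling `require_tokenizer(tokenizer, './data/bert-base-cased')` right after the tokenizer is created would stop the run before any data is loaded and would show which files still need to be placed in that directory.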

Reproducing reported numbers

Hello,

I ran the training and evaluation scripts once and got an F1 of around 54, which is much lower than the 62 reported in the paper. Was that number obtained by choosing the best result across multiple runs? If so, how many runs are needed?

Data preparation/preprocessing

Hi, I'm trying to prepare the data (we bought an AMR 3.0 licence).
However, I could not run preprocess.py: it seems that data/align_unsplit does not exist. Which files do I need to copy there? Your data/ directory contains three msamr_dfb_*.align files, but I do not have these files in my AMR 3.0 data. Are they renamed files, or are they produced by a script from the AMR 3.0 data?
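To report exactly what is present before running preprocess.py, a short check like this can list the alignment files and confirm whether the directory exists. It is a sketch under the assumption that the paths are the ones mentioned above (data/ and data/align_unsplit); the function name `inspect_alignment_inputs` is hypothetical.

```python
from pathlib import Path


def inspect_alignment_inputs(data_dir="data", pattern="msamr_dfb_*.align"):
    """Summarise which alignment inputs exist under data_dir."""
    root = Path(data_dir)
    return {
        # Alignment files sitting directly in data_dir.
        "align_files": sorted(p.name for p in root.glob(pattern)),
        # Whether the directory that preprocess.py looks for is there.
        "align_unsplit_exists": (root / "align_unsplit").is_dir(),
    }


if __name__ == "__main__":
    print(inspect_alignment_inputs())
```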

In prepare_data.py, did nothing with align_path

Assertion failed on "assert len(amr.nodes) == len(concept) == len(amr.node_values)"

mat1 and mat2 shapes cannot be multiplied

No version specified for transformers and other packages.
