amrcoref's Issues
Attribute error: 'penman' has no attribute 'AMRCodec'
The above error showed up when I attempted to run 'prepare_data.py'.
I checked my penman version, and it is the same version that 'pipreqs' reports for the codebase. (I ran pipreqs on your code to see which version it uses, and it matches the one I have installed.)
Please help.
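One way to narrow this down is to check which penman the interpreter actually imports at run time and whether it exposes the attribute, since the version pipreqs reports is not necessarily the one on the interpreter's path. A minimal sketch follows; `describe_attribute` is a hypothetical helper, not part of the repository:

```python
import importlib
from importlib import metadata


def describe_attribute(module_name, attr_name):
    """Report whether a module imports, its installed version, and whether
    it exposes the named attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {"installed": False, "version": None, "has_attr": False}
    try:
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        # Importable but without distribution metadata (e.g. stdlib modules).
        version = None
    return {
        "installed": True,
        "version": version,
        "has_attr": hasattr(module, attr_name),
    }


if __name__ == "__main__":
    print(describe_attribute("penman", "AMRCodec"))
```

Another report in this list ran prepare_data.py with penman 0.6.0, so comparing the printed version against that one may be a useful first step.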
Unclear data format
When running python train.py, it says that:
FileNotFoundError: [Errno 2] No such file or directory: './data/corpora_base/evl'
The dataloader needs to load data in JSON format, but it is not clear how to generate or preprocess that data from the raw LDC2020T02 release. What should the JSON data look like? And what needs to be done to get the code running, given that the LDC2020T02 data is in txt format?
Assertion failed on "assert len(amr.nodes) == len(concept) == len(amr.node_values)"
I used penman 0.6.0 to run prepare_data.py.
Then when I ran preprocess.py, I got:
Traceback (most recent call last):
File "preprocess.py", line 354, in <module>
item = pre_to_json(data_per_doc, links_per_doc, file_name)
File "preprocess.py", line 307, in pre_to_json
data, cluster = get_clusters_info(links, data)
File "preprocess.py", line 260, in get_clusters_info
pdb.set_trace()
File "preprocess.py", line 240, in mapping_edges
assert len(amr.nodes) == len(concept) == len(amr.node_values)
AssertionError
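The bare assert gives no hint as to which of the three sequences is off. A small diagnostic placed just before the assertion would at least print the lengths. `length_report` is a hypothetical helper; the names `amr.nodes`, `concept`, and `amr.node_values` come from the assertion text, and the sketch does not explain why the lengths differ:

```python
def length_report(**sequences):
    """Return (all_equal, message) comparing the lengths of named sequences."""
    lengths = {name: len(seq) for name, seq in sequences.items()}
    all_equal = len(set(lengths.values())) <= 1
    message = ", ".join(f"{name}={n}" for name, n in lengths.items())
    return all_equal, message


if __name__ == "__main__":
    # Stand-in lists; in preprocess.py these would be amr.nodes, concept
    # and amr.node_values.
    ok, msg = length_report(nodes=["a", "b"], concept=["a", "b"], node_values=["a"])
    if not ok:
        print("Length mismatch:", msg)
```

Logging the file name together with this message would show whether the failure is tied to particular documents.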
Tokenizer problem while training using train.py
I am using the preprocessed data to train and am facing the following issue. I understand that the problem is that the model cannot be found. Is there any way to solve this?
GPU available: True CuDNN: True
Using GPU To Train... GPU ID: 1
Log file path: ./ckpt/coref.amr.log
Model name './data/bert-base-cased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed './data/bert-base-cased' was a path or url but couldn't find any file associated to this path or url.
load train data
Traceback (most recent call last):
File "train.py", line 165, in <module>
train(args)
File "train.py", line 34, in train
train_data, dev_data, test_data, vocabs = make_data(args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 504, in make_data
train_data = load_json(args.train_data, args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 156, in load_json
token_bert_ids = get_bert_ids(tokens, args, tokenizer)
File "/home/researcher/Trina/SeanBlankAMRcoref-main/dataloader.py", line 93, in get_bert_ids
for char in tokenizer.tokenize(word):
AttributeError: 'NoneType' object has no attribute 'tokenize'
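The log line before the traceback suggests that nothing was found at ./data/bert-base-cased, and the later AttributeError is consistent with the tokenizer being None as a result. A minimal sketch of a check to run before training follows. It assumes the directory should hold a locally downloaded copy of bert-base-cased with the usual Hugging Face file names (`vocab.txt`, `config.json`, `pytorch_model.bin`); the helper name and the file list are assumptions, not taken from the repository:

```python
from pathlib import Path

# Assumed file names for a locally saved BERT checkpoint (not confirmed by the repo).
EXPECTED_FILES = ("vocab.txt", "config.json", "pytorch_model.bin")


def missing_bert_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from model_dir.

    If the directory itself does not exist, every expected file is reported.
    """
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_bert_files("./data/bert-base-cased")
    if missing:
        print("Missing from ./data/bert-base-cased:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

If files are reported missing, placing a downloaded copy of the model in that directory, or pointing the script at the right location, would be the first thing to try.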
Reproducing reported numbers
Hello,
I ran the training and eval scripts once and got an F1 of around 54, which is much lower than the 62 reported in the paper. Was that number obtained by choosing the best of multiple runs? If so, how many runs are needed?
Data preparation/preprocessing
Hi, I'm trying to prepare data (we bought an AMR3.0 licence).
But I failed to run preprocess.py. It seems that data/align_unsplit does not exist. Which files must I copy there? In your data/ there are three msamr_dfb_*.align files, but I do not have these files in my AMR3.0 data. Are these renamed files, or are they produced by a script from the AMR3 data?
prepare_data.py does nothing with align_path
I found that the code in prepare_data.py does nothing with align_path, so no files are generated under align_unsplit.
Assertion failed on "assert len(amr.nodes) == len(concept) == len(amr.node_values)"
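Could you send me the processed data? I ran into the same problem as the earlier poster. My school has an LDC license; if possible, please send it to [email protected]. Thank you.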
mat1 and mat2 shapes cannot be multiplied
Hello, when I try to use BERT, I get the error "mat1 and mat2 shapes cannot be multiplied". How did you run the code with BERT?
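PyTorch raises this message when the inner dimensions of a matrix product disagree, for example when a layer sized for one embedding width receives BERT outputs of another (bert-base models produce 768-dimensional hidden states). A plain-Python sketch of the rule follows, with hypothetical shapes; the 256 is a made-up layer width, not a value from the repository:

```python
def matmul_shape(left, right):
    """Return the result shape of a 2-D matrix product, or raise ValueError.

    left is (n, k) and right is (k2, m); the product exists only if k == k2.
    """
    (n, k), (k2, m) = left, right
    if k != k2:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied ({n}x{k} and {k2}x{m})"
        )
    return (n, m)


if __name__ == "__main__":
    # 768 is the hidden size of bert-base; 256 is a hypothetical input width
    # of a downstream linear layer.
    print(matmul_shape((10, 768), (768, 128)))
    try:
        matmul_shape((10, 768), (256, 128))
    except ValueError as err:
        print(err)
```

The two shapes printed in the real error message show which side of the product has the unexpected width, which in turn points to the config value that needs to match the BERT hidden size.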
No version specified for transformers and other packages.
Please provide the exact versions of packages such as transformers. Only the python and torch versions have been provided.
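Until pinned versions are published, anyone with a working environment could record theirs with a short script like the one below. It is a sketch using only the standard library; `installed_versions` is a hypothetical helper, and the package list is just the set of names mentioned in these reports:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # torch, transformers and penman are the packages named in the reports above.
    for name, version in installed_versions(["torch", "transformers", "penman"]).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```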