
ibm / transition-amr-parser


SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in Pytorch. Includes checkpoints and other tools such as statistical significance Smatch.

License: Apache License 2.0

Topics: machine-learning, nlp, semantic-parsing, amr, amr-parser, abstract-meaning-representation, amr-parsing, amr-graphs

transition-amr-parser's Introduction

Transition-based Neural Parser

State-of-the-Art Abstract Meaning Representation (AMR) parsing; see papers with code. The parser models both the distribution over graphs and alignments with a transition-based approach, and supports generic text-to-graph parsing as long as the target graph is expressed in Penman notation.

Some of the main features

Install Instructions

Create and activate a virtual environment with Python 3.8, for example

conda create -y -p ./cenv_x86 python=3.8
conda activate ./cenv_x86

Alternatively, use virtualenv and pyenv to manage Python versions. Note that all scripts source a set_environment.sh script, which you can use to activate your virtual environment (as above) and set environment variables. If you do not need it, just create an empty file

# or e.g. put inside conda activate ./cenv_x86
touch set_environment.sh
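
If you do use it, a minimal set_environment.sh could look like the sketch below (adapt the activation line and any variables to your own setup):

# example set_environment.sh, sourced by all run scripts
conda activate ./cenv_x86
# export any environment variables your setup needs here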

Then install the parser package using pip. You will need to install torch-scatter manually, since it is built against a specific CUDA version. The command below is for torch 1.13.1 and CUDA 11.7; see the torch-scatter repository for the installation instructions matching your setup.

For MacOS users

(Install the CPU version of torch-scatter; note that model training is not fully supported on MacOS.)

pip install transition-neural-parser
# for linux users
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
# for cpu installation for MacOS
# pip install torch-scatter

If you plan to edit the code, clone and install instead

# clone this repo (see link above), then
cd transition-neural-parser
pip install --editable .
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
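
In either case, you can sanity-check the installation with the small test script shipped in the repository (also mentioned in the issues below):

python tests/correctly_installed.py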

If you want to train a document-level AMR parser you will also need

git clone https://github.com/IBM/docAMR.git
cd docAMR
pip install .
cd ..

Parse with a pretrained model

Here is an example of how to download and use a pretrained AMR parser in Python

from transition_amr_parser.parse import AMRParser

# Download and save a model named AMR3.0 to cache
parser = AMRParser.from_pretrained('AMR3-structbart-L')
tokens, positions = parser.tokenize('The girl travels and visits places')

# Use parse_sentence() for single sentences or parse_sentences() for a batch
annotations, machines = parser.parse_sentence(tokens)

# Print Penman notation
print(annotations)

# Print Penman notation without JAMR, with ISI
amr = machines.get_amr()
print(amr.to_penman(jamr=False, isi=True))

# Plot the graph (requires matplotlib)
amr.plot()

Note that Smatch does not support ISI-type alignments and gives worse results when they are present. Set isi=False to remove them.
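
As a rough sketch of batch parsing (assuming parse_sentences() accepts a list of tokenized sentences and returns annotations and machines as parallel lists, mirroring parse_sentence()):

from transition_amr_parser.parse import AMRParser

parser = AMRParser.from_pretrained('AMR3-structbart-L')

sentences = ['The girl travels and visits places', 'She visits London']
# Tokenize each sentence first, as in the single-sentence example above
batch = [parser.tokenize(sentence)[0] for sentence in sentences]

# Assumption: parse_sentences() mirrors parse_sentence() over a batch
annotations, machines = parser.parse_sentences(batch)
for annotation in annotations:
    print(annotation)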

You can also use the command line to run a pretrained model to parse a file:

amr-parse -c $in_checkpoint -i $input_file -o file.amr

Models can also be downloaded and used directly by passing -m <config> instead of a checkpoint.
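
For example, a sketch using one of the pretrained model names listed in the table further below:

amr-parse -m AMR3-structbart-L -i $input_file -o file.amr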

Note that Smatch does not support ISI alignments and gives worse results when they are present. Use --no-isi to store alignments in the ::alignments metadata instead. Also use --jamr to add JAMR annotations to the metadata.

Document-level Parsing

Document-level parsing represents co-reference using :same-as edges. To change the representation and merge the co-referent nodes as in the paper, please refer to the DocAMR repo.

from transition_amr_parser.parse import AMRParser

# Download and save the docamr model to cache
parser = AMRParser.from_pretrained('doc-sen-conll-amr-seed42')

# Sentences in the doc
doc = ["Hailey likes to travel." ,"She is going to London tomorrow.", "She will walk to Big Ben when she goes to London."]

# tokenize sentences if not already tokenized
tok_sentences = []
for sen in doc:
    tokens, positions = parser.tokenize(sen)
    tok_sentences.append(tokens)

# parse_docs() takes a list of documents (each a list of tokenized sentences) as input
annotations, machines = parser.parse_docs([tok_sentences])

# Print Penman notation
print(annotations[0])

# Print Penman notation without JAMR, with ISI
amr = machines[0].get_amr()
print(amr.to_penman(jamr=False, isi=True))

# Plot the graph (requires matplotlib)
amr.plot()

To parse a document from the command line, the input file $doc_input_file should be a text file where each line is a sentence of the document and an empty line ('\n') separates documents (including after the last one):

amr-parse -c $in_checkpoint --in-doc $doc_input_file -o file.docamr
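
As an illustration of that format, the sketch below writes a two-document input file (the file name doc_input.txt is just a placeholder):

docs = [
    ["Hailey likes to travel.", "She is going to London tomorrow."],
    ["Big Ben is in London.", "Many tourists visit it."],
]

# one sentence per line; an empty line ends each document, including the last
with open('doc_input.txt', 'w') as f:
    for doc in docs:
        for sentence in doc:
            f.write(sentence + '\n')
        f.write('\n')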

Available Pretrained Model Checkpoints

The models downloaded using from_pretrained() will be stored in the PyTorch cache folder under:

cache_dir = torch.hub._get_torch_home()
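
For example, to see what has been downloaded so far (a small sketch; the exact sub-folder layout inside the cache may differ):

import os
import torch

# same location the parser uses for downloaded checkpoints
cache_dir = torch.hub._get_torch_home()
print('torch hub cache:', cache_dir)

if os.path.isdir(cache_dir):
    for name in sorted(os.listdir(cache_dir)):
        print(' ', name)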

This table shows the available pretrained model names to download:

pretrained model name | corresponding file name | paper | beam10-Smatch
AMR3-structbart-L-smpl | amr3.0-structured-bart-large-neur-al-sampling5-seed42 | (Drozdov et al 2022) PR | 82.9 (beam1)
AMR3-structbart-L | amr3.0-structured-bart-large-neur-al-seed42 | (Drozdov et al 2022) MAP | 82.6
AMR2-structbart-L | amr2.0-structured-bart-large-neur-al-seed42 | (Drozdov et al 2022) MAP | 84.0
AMR2-joint-ontowiki-seed42 | amr2joint_ontowiki2_g2g-structured-bart-large-seed42 | (Lee et al 2022) (ensemble) | 85.9
AMR2-joint-ontowiki-seed43 | amr2joint_ontowiki2_g2g-structured-bart-large-seed43 | (Lee et al 2022) (ensemble) | 85.9
AMR2-joint-ontowiki-seed44 | amr2joint_ontowiki2_g2g-structured-bart-large-seed44 | (Lee et al 2022) (ensemble) | 85.9
AMR3-joint-ontowiki-seed42 | amr3joint_ontowiki2_g2g-structured-bart-large-seed42 | (Lee et al 2022) (ensemble) | 84.4
AMR3-joint-ontowiki-seed43 | amr3joint_ontowiki2_g2g-structured-bart-large-seed43 | (Lee et al 2022) (ensemble) | 84.4
AMR3-joint-ontowiki-seed44 | amr3joint_ontowiki2_g2g-structured-bart-large-seed44 | (Lee et al 2022) (ensemble) | 84.4
doc-sen-conll-amr-seed42 | both_doc+sen_trainsliding_ws400x100-seed42 | | 82.3 [1] / 71.8 [2]

[1] Smatch on AMR3.0 sentences

[2] Smatch on the AMR3.0 Multi-Sentence dataset

Contact the authors to obtain the trained ibm-neural-aligner. For the ensembles we provide the three seeds. Following fairseq conventions, to run an ensemble just pass the three checkpoint paths joined by : to the normal checkpoint argument -c. Note that the checkpoints were trained with the v0.5.1 tokenizer; this reduces performance by 0.1 on v0.5.2-tokenized data.
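
For example, a sketch with placeholder checkpoint paths for the three seeds:

amr-parse -c seed42/checkpoint_best.pt:seed43/checkpoint_best.pt:seed44/checkpoint_best.pt -i $input_file -o file.amr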

Note that we always report the average of three seeds in papers, while these are individual models. A fast way to test a model standalone is

bash tests/standalone.sh configs/<config>.sh

Training a model

You first need to pre-process and align the data. For AMR2.0 do

conda activate ./cenv_x86 # activate parser environment
python scripts/merge_files.py /path/to/LDC2017T10/data/amrs/split/ DATA/AMR2.0/corpora/

You will also need to unzip the precomputed BLINK cache. See issues in this repository to get the cache file (or the link above for IBM-ers).

unzip /path/to/linkcache.zip

To launch train/test use (this will also run the aligner)

bash run/run_experiment.sh configs/amr2.0-structured-bart-large.sh

Training will store and evaluate all checkpoints by default (see the config's EVAL_INIT_EPOCH) and select the one with the best dev Smatch. This needs a lot of disk space, but you can launch a parallel job that performs evaluation and deletes checkpoints not in the top 5

bash run/run_model_eval.sh configs/amr2.0-structured-bart-large.sh

You can check training status with

python run/status.py -c configs/amr2.0-structured-bart-large.sh

Use --results to check the scores once the models have finished.

We include code to launch parallel jobs with the LSF job scheduler. This can be adapted to other schedulers, e.g. Slurm; see here.

Initialize with WatBART

To load WatBART instead of BART, just uncomment the following line in the config and provide the checkpoint path:

initialize_with_watbart=/path/to/checkpoint_best.pt

transition-amr-parser's People

Contributors

ablodge, chanind, cpendus, gangiswag, gx-xu, jzhou316, kant, kjbarker-work, mrdrozdov, ramon-astudillo, sadhana01, stevemar, tahira, xsthunder, ysuklee


transition-amr-parser's Issues

Please publish this on PyPI

This is the current state of the art for AMR parsing, but it's still difficult to integrate into other projects that require AMR parsing, since it's not a published module and is built on the assumption that it's a code demo rather than a tool for parsing AMR. It looks very close to being publishable, though, e.g. it has a setup.py that looks correct. I think it just needs to stop activating the debugger by default and it should work as a PyPI module.

If this were published as a PyPI module it would help push the whole AMR field forward, and I'm sure it would result in more citations for your papers too, since others could then easily use it in further projects rather than falling back on easier-to-run but less powerful AMR parsers like amrlib.

I want to learn how to construct an AMR parser by myself

Excuse me, I need your help. I want to learn how to construct an AMR parser, so I have been reading many papers, but I still cannot understand what the papers express, or even the algorithm. Do you have any books or papers to recommend? I am new to this kind of work.

Question about setting the environment

Hi!

I ran the command pip install --editable ., but I got some errors, particularly related to the torch version (the full message is below). Do you have any suggestions on how I might solve this issue?

Building wheels for collected packages: torch-scatter
Building wheel for torch-scatter (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/alabate/transition-amr-parser/venv/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-zovxf2mm/torch-scatter/setup.py'"'"'; file='"'"'/tmp/pip-install-zovxf2mm/torch-scatter/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-9szx92a5
cwd: /tmp/pip-install-zovxf2mm/torch-scatter/
Complete output (27 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/__init__.py -> build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/segment_csr.py -> build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/segment_coo.py -> build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/scatter.py -> build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/placeholder.py -> build/lib.linux-x86_64-3.8/torch_scatter
copying torch_scatter/utils.py -> build/lib.linux-x86_64-3.8/torch_scatter
creating build/lib.linux-x86_64-3.8/torch_scatter/composite
copying torch_scatter/composite/__init__.py -> build/lib.linux-x86_64-3.8/torch_scatter/composite
copying torch_scatter/composite/logsumexp.py -> build/lib.linux-x86_64-3.8/torch_scatter/composite
copying torch_scatter/composite/std.py -> build/lib.linux-x86_64-3.8/torch_scatter/composite
copying torch_scatter/composite/softmax.py -> build/lib.linux-x86_64-3.8/torch_scatter/composite
running build_ext
building 'torch_scatter._version_cpu' extension
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/csrc
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -Icsrc -I/home/alabate/transition-amr-parser/venv/lib/python3.8/site-packages/torch/include -I/home/alabate/transition-amr-parser/venv/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/alabate/transition-amr-parser/venv/lib/python3.8/site-packages/torch/include/TH -I/home/alabate/transition-amr-parser/venv/lib/python3.8/site-packages/torch/include/THC -I/home/alabate/transition-amr-parser/venv/include -I/usr/include/python3.8 -c csrc/version.cpp -o build/temp.linux-x86_64-3.8/csrc/version.o -O2 -DAT_PARALLEL_OPENMP -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_version_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
csrc/version.cpp:1:10: fatal error: Python.h: File or directory nonexistent
1 | #include <Python.h>
| ^~~~~~~~~~
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

ERROR: Failed building wheel for torch-scatter
Running setup.py clean for torch-scatter
Failed to build torch-scatter
ERROR: torchvision 0.14.1 has requirement torch==1.13.1, but you'll have torch 1.10.1 which is incompatible.

Thanks!
Anton

Question about installation

In the README, you say we should set up a set_environment.sh. Also, in some issues, people mention that set_environment.sh depends on one's own environment. Could you explain clearly how to write set_environment.sh myself?

In addition, I have another problem with git checkout <branch>: I do not know which branch I should check out. My goal is to use from transition_amr_parser.stack_transformer_amr_parser import AMRParser. I think there is something wrong with my installation, because I often receive an error message like AttributeError: module 'torch.utils.data' has no attribute 'IterableDataset'. It looks related to fairseq. Could you tell me the potential problem?

Thanks.

AssertionError: current action not in the allowed space? check the rules.

Hi, I'm trying to train the action-transformer model on the AMR2.0 dataset. I followed the README and ran bash run/run_experiment.sh configs/amr2.0-action-pointer.sh, but I get the following assertion error. What should I do?

Oracle: 36521it [18:55, 32.16it/s]
Base actions:
Counter({'PRED': 7943, 'ENTITY': 717, 'RA': 142, 'LA': 121, 'COPY_SENSE01': 1, 'SHIFT': 1, 'REDUCE': 1, 'COPY_LEMMA': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 417623), ('REDUCE', 256155), ('COPY_LEMMA', 121546), ('COPY_SENSE01', 66826), ('RA(:ARG1)', 59949), ('LA(:ARG0)', 49124), ('LA(root)', 36384), ('LA(:mod)', 32481), ('LA(:ARG1)', 30126), ('RA(:ARG2)', 25345)]
3876 singleton actions
Counter({'PRED': 3504, 'ENTITY': 345, 'LA': 19, 'RA': 8})
Reading DATA/AMR2.0/aligned/cofill//dev.txt
1368 sentences
3875/23797 node types/tokens
109/24019 edge types/tokens
5385/29269 word types/tokens
39/1368 2.9 % repeated sents (max 9 times)
6/1368 0.0044 % inconsistent labelings from repeated sents
Oracle: 1368it [00:44, 30.96it/s]
Base actions:
Counter({'PRED': 1335, 'ENTITY': 135, 'RA': 102, 'LA': 70, 'SHIFT': 1, 'REDUCE': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 18483), ('REDUCE', 10934), ('COPY_LEMMA', 5746), ('COPY_SENSE01', 3236), ('RA(:ARG1)', 2701), ('LA(:ARG0)', 2073), ('LA(:mod)', 1623), ('LA(root)', 1366), ('LA(:ARG1)', 1351), ('MERGE', 1220)]
792 singleton actions
Counter({'PRED': 704, 'ENTITY': 56, 'RA': 20, 'LA': 12})
Reading DATA/AMR2.0/aligned/cofill//test.txt
1371 sentences
3897/24451 node types/tokens
112/25113 edge types/tokens
5364/30054 word types/tokens
20/1371 1.5 % repeated sents (max 7 times)
1/1371 0.0007 % inconsistent labelings from repeated sents
Oracle: 1371it [00:45, 30.11it/s]
Base actions:
Counter({'PRED': 1365, 'ENTITY': 127, 'RA': 103, 'LA': 74, 'SHIFT': 1, 'COPY_SENSE01': 1, 'COPY_LEMMA': 1, 'REDUCE': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 18874), ('REDUCE', 11123), ('COPY_LEMMA', 5787), ('COPY_SENSE01', 3227), ('RA(:ARG1)', 2813), ('LA(:ARG0)', 2205), ('LA(:mod)', 1661), ('MERGE', 1428), ('LA(:ARG1)', 1423), ('LA(root)', 1357)]
829 singleton actions
Counter({'PRED': 745, 'ENTITY': 51, 'LA': 17, 'RA': 16})
[Preprocessing data:]
[Configuration file:]
configs/amr2.0-action-pointer.sh
Cleaning up partially completed DATA/AMR2.0/features/cofill_o8.3_act-states_RoBERTa-large-top24//
Namespace(alignfile=None, batch_normalize_reward=False, bert_layers=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='DATA/AMR2.0/features/cofill_o8.3_act-states_RoBERTa-large-top24//', embdir='DATA/AMR2.0/embeddings/RoBERTa-large-top24', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gold_annotations=None, gold_episode_ratio=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', machine_rules=None, machine_type=None, memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, pretrained_embed='roberta.large', seed=1, source_lang='en', srcdict=None, target_lang='actions', task='amr_action_pointer_graphmp', tbmf_wrapper=False, tensorboard_logdir='', testpref='DATA/AMR2.0/oracles/cofill_o8.3_act-states//test', tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='DATA/AMR2.0/oracles/cofill_o8.3_act-states//train', user_dir='../fairseq_ext', validpref='DATA/AMR2.0/oracles/cofill_o8.3_act-states//dev', workers=1)
| [en] Dictionary: 33263 types
| [en] DATA/AMR2.0/oracles/cofill_o8.3_act-states//train.en: 36521 sents, 689426 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 33263 types
| [en] DATA/AMR2.0/oracles/cofill_o8.3_act-states//dev.en: 1368 sents, 30637 tokens, 4.13% replaced by <unk>
| [en] Dictionary: 33263 types
| [en] DATA/AMR2.0/oracles/cofill_o8.3_act-states//test.en: 1371 sents, 31425 tokens, 3.74% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/AMR2.0/oracles/cofill_o8.3_act-states//train.en]
[AMR actions file: DATA/AMR2.0/oracles/cofill_o8.3_act-states//train.actions]
 processed 2000 en-actions pairs (time: 4m 41s)Traceback (most recent call last):
  File "fairseq_ext/preprocess_graphmp.py", line 313, in <module>
    cli_main()
  File "fairseq_ext/preprocess_graphmp.py", line 309, in cli_main
    main(args)
  File "fairseq_ext/preprocess_graphmp.py", line 269, in main
    task_obj.build_actions_states_info(en_file, actions_file, out_file_pref, num_workers=args.workers)
  File "/mnt/nfs-storage/transition-amr-parser/fairseq_ext/tasks/amr_action_pointer_graphmp.py", line 365, in build_actions_states_info
    impl='mmap', tokenize=self.tokenize, num_workers=num_workers)
  File "/mnt/nfs-storage/transition-amr-parser/fairseq_ext/amr_spec/action_info_binarize_graphmp.py", line 410, in binarize_actstates_tofile_workers
    actions_offset=0, actions_end=actions_offsets[1])
  File "/mnt/nfs-storage/transition-amr-parser/fairseq_ext/amr_spec/action_info_binarize_graphmp.py", line 114, in binarize
    actions_states = get_actions_states(tokens=tokenize(line), actions=tokenize(actions))
  File "/mnt/nfs-storage/transition-amr-parser/fairseq_ext/amr_spec/action_info_graphmp.py", line 57, in get_actions_states
    assert cano_act in act_allowed, 'current action not in the allowed space? check the rules.'
AssertionError: current action not in the allowed space? check the rules.

problem with installation

Hello,
I followed your setup instructions and installed the AMR parser from pip. I checked the installation with python tests/correctly_installed.py and it showed everything OK.

But nevertheless I have two issues:

  1. When I run bash preprocess/install_alignment_tools.sh I get this error:
    Error: Could not retrieve sbt 0.13.5

  2. When I run bash tests/minimal_test.sh it starts loading my GPU (NVidia Titan RTX) and never stops. Your instructions note that it should not run for more than a minute...

Maybe I'm still missing some libraries, or my Python version is incompatible?

tests/minimal_test.sh: RuntimeError: Function 'LogBackward' returned nan values in its 0th output.

Env:

  1. torch.__version__ '1.2.0a0+afb7a16'
  2. Python 3.6.9 :: Anaconda, Inc.

Solutions I've tried:

  1. remove DATA/wiki25 and run tests/minimal_test.sh again
  2. tried running bash run/run_experiment.sh configs/amr2.0-action-pointer.sh and got the same error. I can't find /path/to/linkcache.zip to unzip; neither python preprocess/merge_files.py /path/to/LDC2017T10/data/amrs/split/ DATA/AMR2.0/corpora/ nor bash run/run_experiment.sh configs/amr2.0-action-pointer.sh generates /path/to/linkcache.zip. linkcache.zip seems to be a cache file; am I missing something?

Full Logs:

#bash tests/minimal_test.sh
[Configuration file:]
configs/wiki25.sh
[Building oracle actions:]
[Configuration file:]
configs/wiki25.sh
Directory to aligner: DATA/wiki25/aligned/cofill/ already exists --- do nothing.
[normalize rules] months
[normalize rules] units
[normalize rules] cardinals
[normalize rules] ordinals
Reading DATA/wiki25/aligned/cofill//train.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Requirement already satisfied: en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0 in /opt/conda/lib/python3.6/site-packages (2.0.0)

    Linking successful
    /opt/conda/lib/python3.6/site-packages/en_core_web_sm -->
    /opt/conda/lib/python3.6/site-packages/spacy/data/en

    You can now load the model via spacy.load('en')

Oracle: 25it [00:00, 111.70it/s]
Base actions:
Counter({'PRED': 59, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 66), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
76 singleton actions
Counter({'PRED': 55, 'ENTITY': 9, 'RA': 8, 'LA': 4})
Reading DATA/wiki25/aligned/cofill//dev.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Oracle: 25it [00:00, 74.79it/s]
Base actions:
Counter({'PRED': 59, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 66), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
76 singleton actions
Counter({'PRED': 55, 'ENTITY': 9, 'RA': 8, 'LA': 4})
Reading DATA/wiki25/aligned/cofill//test.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Oracle: 25it [00:00, 85.38it/s]
Base actions:
Counter({'PRED': 59, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 66), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
76 singleton actions
Counter({'PRED': 55, 'ENTITY': 9, 'RA': 8, 'LA': 4})
[Preprocessing data:]
[Configuration file:]
configs/wiki25.sh
Cleaning up partially completed DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//
Namespace(alignfile=None, batch_normalize_reward=False, bert_layers=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//', embdir='DATA/wiki25/embeddings/RoBERTa-large-top24', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gold_annotations=None, gold_episode_ratio=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', machine_rules=None, machine_type=None, memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, pretrained_embed='roberta.large', seed=1, source_lang='en', srcdict=None, target_lang='actions', task='amr_action_pointer_graphmp', tbmf_wrapper=False, tensorboard_logdir='', testpref='DATA/wiki25/oracles/cofill_o8.3_act-states//test', tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='DATA/wiki25/oracles/cofill_o8.3_act-states//train', user_dir='../fairseq_ext', validpref='DATA/wiki25/oracles/cofill_o8.3_act-states//dev', workers=1)
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//train.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//dev.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//test.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//train.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//train.actions]
processing ... 
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//train.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//dev.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//dev.actions]
processing ... 
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//dev.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//test.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//test.actions]
processing ... 
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//test.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//test.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
Using cache found in /root/.cache/torch/hub/pytorch_fairseq_master
loading archive file http://dl.fbaipublicfiles.com/fairseq/models/roberta.large.tar.gz from cache at /root/.cache/torch/pytorch_fairseq/83e3a689e28e5e4696ecb0bbb05a77355444a5c8a3437e0f736d8a564e80035e.c687083d14776c1979f3f71654febb42f2bb3d9a94ff7ebdfe1ac6748dba89d2
| dictionary: 50264 types
Using roberta.large extraction in GPU

Using cache found in /root/.cache/torch/hub/pytorch_fairseq_master
loading archive file http://dl.fbaipublicfiles.com/fairseq/models/roberta.large.tar.gz from cache at /root/.cache/torch/pytorch_fairseq/83e3a689e28e5e4696ecb0bbb05a77355444a5c8a3437e0f736d8a564e80035e.c687083d14776c1979f3f71654febb42f2bb3d9a94ff7ebdfe1ac6748dba89d2
| dictionary: 50264 types
Using roberta.large extraction in GPU

Using cache found in /root/.cache/torch/hub/pytorch_fairseq_master
loading archive file http://dl.fbaipublicfiles.com/fairseq/models/roberta.large.tar.gz from cache at /root/.cache/torch/pytorch_fairseq/83e3a689e28e5e4696ecb0bbb05a77355444a5c8a3437e0f736d8a564e80035e.c687083d14776c1979f3f71654febb42f2bb3d9a94ff7ebdfe1ac6748dba89d2
| dictionary: 50264 types
Using roberta.large extraction in GPU

| Wrote preprocessed oracle data to DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//
| Wrote preprocessed embedding data to DATA/wiki25/embeddings/RoBERTa-large-top24
[Training:]
[Configuration file:]
configs/wiki25.sh
Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas='(0.9,0.98)', adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, append_eos_to_target=0, apply_tgt_actnode_masks=0, apply_tgt_input_src=0, apply_tgt_src_align=1, apply_tgt_vocab_masks=1, arch='transformer_tgt_pointer_graphmp', attention_dropout=0.0, bert_backprop=False, best_checkpoint_metric='loss', bpe=None, bucket_cap_mb=25, clip_norm=0.0, collate_tgt_states=1, cpu=False, criterion='label_smoothed_cross_entropy_pointer', curriculum=0, data='DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//', dataset_impl=None, ddp_backend='c10d', decoder_attention_heads=4, decoder_embed_dim=256, decoder_embed_path=None, decoder_ffn_embed_dim=512, decoder_input_dim=256, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=256, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=1, dropout=0.3, emb_dir='DATA/wiki25/embeddings/RoBERTa-large-top24', encode_state_machine=None, encoder_attention_heads=4, encoder_embed_dim=256, encoder_embed_path=None, encoder_ffn_embed_dim=512, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=False, find_unused_parameters=False, fix_batches_to_gpus=False, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, keep_interval_updates=-1, keep_last_epochs=6, label_smoothing=0.01, lazy_load=False, left_pad_source='True', left_pad_target='False', log_format='json', log_interval=1000, loss_coef=1.0, lr=[0.0005], lr_scheduler='inverse_sqrt', max_epoch=10, max_sentences=None, max_sentences_valid=None, max_source_positions=1024, max_target_positions=1024, max_tokens=3584, max_tokens_valid=3584, max_update=0, maximize_best_checkpoint_metric=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=1e-09, no_bert_precompute=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_token_positional_embeddings=False, num_workers=1, optimizer='adam', optimizer_overrides='{}', pointer_dist_decoder_selfattn_avg=0, pointer_dist_decoder_selfattn_heads=1, pointer_dist_decoder_selfattn_infer=5, pointer_dist_decoder_selfattn_layers=[5], pretrained_embed_dim=1024, raw_text=False, required_batch_size_multiple=8, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', save_interval=1, save_interval_updates=0, seed=42, sentence_avg=False, share_all_embeddings=False, share_decoder_input_output_embed=0, shift_pointer_value=1, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, task='amr_action_pointer_graphmp', tbmf_wrapper=False, tensorboard_logdir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', tgt_factored_emb_out=0, tgt_graph_heads=2, tgt_graph_layers=[0, 1, 2], tgt_graph_mask='allprev_1in1out', tgt_input_src_backprop=1, tgt_input_src_combine='add', tgt_input_src_emb='top', tgt_src_align_focus=['p0c1n0', 'p0c0n*'], tgt_src_align_heads=2, tgt_src_align_layers=[0, 1, 2, 3, 
4, 5], threshold_loss_scale=None, tokenizer=None, train_subset='train', update_freq=[1], upsample_primary=1, use_bmuf=False, user_dir='../fairseq_ext', valid_subset='valid', validate_interval=1, warmup_init_lr=1e-07, warmup_updates=4000, weight_decay=0.0)
| [en] dictionary: 248 types
| [actions_nopos] dictionary: 128 types
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.en
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.bert
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wordpieces
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wp2w
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_in
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_out
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.pos
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.vocab_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.src_cursors
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_1stnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_1stnode_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_directions
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_directions
TransformerTgtPointerGraphMPModel(
  (encoder): TransformerEncoder(
    (subspace): Linear(in_features=1024, out_features=256, bias=False)
    (embed_tokens): Embedding(248, 256, padding_idx=1)
    (embed_positions): SinusoidalPositionalEmbedding()
    (layers): ModuleList(
      (0): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (1): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (2): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (3): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (4): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (5): TransformerEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
    )
  )
  (decoder): TransformerDecoder(
    (embed_tokens): Embedding(128, 256, padding_idx=1)
    (embed_positions): SinusoidalPositionalEmbedding()
    (layers): ModuleList(
      (0): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (1): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (2): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (3): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (4): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
      (5): TransformerDecoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (self_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (encoder_attn): MultiheadAttention(
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (encoder_attn_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
        (fc1): Linear(in_features=256, out_features=512, bias=True)
        (fc2): Linear(in_features=512, out_features=256, bias=True)
        (final_layer_norm): FusedLayerNorm(torch.Size([256]), eps=1e-05, elementwise_affine=True)
      )
    )
  )
)
| model transformer_tgt_pointer_graphmp, criterion LabelSmoothedCrossEntropyPointerCriterion
| num. model params: 8298496 (num. trained: 8298496)
| training on 1 GPUs
| max tokens per GPU = 3584 and max sentences per GPU = None
| no existing checkpoint found DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42/checkpoint_last.pt
| loading train data for epoch 0
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.en
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.bert
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.wordpieces
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.wp2w
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.nopos_in
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.nopos_out
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.pos
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.vocab_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.src_cursors
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_1stnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_cur_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_cur_1stnode_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_directions
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_directions
| NOTICE: your device may support faster training with --fp16
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
../torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
  File "fairseq_ext/train.py", line 338, in <module>
    cli_main()
  File "fairseq_ext/train.py", line 334, in cli_main
    main(args)
  File "fairseq_ext/train.py", line 103, in main
    train(args, trainer, task, epoch_itr)
  File "fairseq_ext/train.py", line 149, in train
    log_output = trainer.train_step(samples)
  File "/opt/conda/lib/python3.6/site-packages/fairseq/trainer.py", line 264, in train_step
    ignore_grad
  File "/workspace/transition-amr-torch03/fairseq_ext/tasks/amr_action_pointer_graphmp.py", line 463, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/transition-amr-torch03/fairseq_ext/criterions/label_smoothed_cross_entropy_pointer.py", line 106, in forward
    loss_pos, nll_loss_pos = self.compute_pointer_loss(net_output, sample, reduce=reduce)
  File "/workspace/transition-amr-torch03/fairseq_ext/criterions/label_smoothed_cross_entropy_pointer.py", line 150, in compute_pointer_loss
    attn = torch.log(attn)

Traceback (most recent call last):
  File "fairseq_ext/train.py", line 338, in <module>
    cli_main()
  File "fairseq_ext/train.py", line 334, in cli_main
    main(args)
  File "fairseq_ext/train.py", line 103, in main
    train(args, trainer, task, epoch_itr)
  File "fairseq_ext/train.py", line 149, in train
    log_output = trainer.train_step(samples)
  File "/opt/conda/lib/python3.6/site-packages/fairseq/trainer.py", line 287, in train_step
    raise e
  File "/opt/conda/lib/python3.6/site-packages/fairseq/trainer.py", line 264, in train_step
    ignore_grad
  File "/workspace/transition-amr-torch03/fairseq_ext/tasks/amr_action_pointer_graphmp.py", line 470, in train_step
    optimizer.backward(loss)
  File "/opt/conda/lib/python3.6/site-packages/fairseq/optim/fairseq_optimizer.py", line 75, in backward
    loss.backward()
  File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function 'LogBackward' returned nan values in its 0th output.

src_fixed_embeddings.sizes. AttributeError: 'NoneType' object has no attribute 'sizes'

I was just trying to train a simple model on the PTB and I got the following error. Could you tell me what's wrong? Thanks in advance.

stage-1: Preprocess
stage-2/3: Training/Testing (multiple seeds)
Saved fairseq model args to DATA/dep-parsing/models/PTB_RoBERTa-base_stnp6x6-seed42/config.json
fairseq-train
DATA/dep-parsing/features/PTB_RoBERTa-base
--max-epoch 80
--arch stack_transformer_6x6_nopos
--optimizer adam
--adam-betas '(0.9,0.98)'
--clip-norm 0.0
--lr-scheduler inverse_sqrt
--warmup-init-lr 1e-07
--warmup-updates 4000
--pretrained-embed-dim 768
--lr 0.0005
--min-lr 1e-09
--dropout 0.3
--weight-decay 0.0
--criterion label_smoothed_cross_entropy
--label-smoothing 0.01
--keep-last-epochs 40
--max-tokens 3584
--log-format json
--fp16
--seed 42 --save-dir DATA/dep-parsing/models/PTB_RoBERTa-base_stnp6x6-seed42
CUDA? True
| distributed init (rank 1): tcp://localhost:10159
CUDA? True
| distributed init (rank 0): tcp://localhost:10159
| initialized host flash as rank 1
| initialized host flash as rank 0
Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas="'(0.9,0.98)'", adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, arch='stack_transformer_6x6_nopos', attention_dropout=0.0, bert_backprop=False, best_checkpoint_metric='loss', bpe=None, bucket_cap_mb=25, burnthrough=0, clip_norm=0.0, cpu=False, criterion='label_smoothed_cross_entropy', curriculum=0, data='DATA/dep-parsing/features/PTB_RoBERTa-base', dataset_impl=None, ddp_backend='c10d', decoder_attention_heads=4, decoder_embed_dim=256, decoder_embed_path=None, decoder_ffn_embed_dim=512, decoder_input_dim=256, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=256, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method='tcp://localhost:10159', distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=2, dropout=0.3, encode_state_machine='all-layers_nopos', encoder_attention_heads=4, encoder_embed_dim=256, encoder_embed_path=None, encoder_ffn_embed_dim=512, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=False, find_unused_parameters=False, fix_batches_to_gpus=False, fp16=True, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, keep_interval_updates=-1, keep_last_epochs=40, label_smoothing=0.01, lazy_load=False, left_pad_source='True', left_pad_target='False', log_format='json', log_interval=1000, lr=[0.0005], lr_scheduler='inverse_sqrt', max_epoch=80, max_sentences=None, max_sentences_valid=None, max_source_positions=1024, max_target_positions=1024, max_tokens=3584, max_tokens_valid=3584, max_update=0, maximize_best_checkpoint_metric=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=1e-09, no_bert_precompute=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_token_positional_embeddings=False, num_workers=1, optimizer='adam', optimizer_overrides='{}', pretrained_embed_dim=768, raw_text=False, required_batch_size_multiple=8, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='DATA/dep-parsing/models/PTB_RoBERTa-base_stnp6x6-seed42', save_interval=1, save_interval_updates=0, seed=42, sentence_avg=False, share_all_embeddings=False, share_decoder_input_output_embed=False, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, task='translation', tbmf_wrapper=False, tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, train_subset='train', update_freq=[1], upsample_primary=1, use_bmuf=False, user_dir=None, valid_subset='valid', validate_interval=1, warmup_init_lr=1e-07, warmup_updates=4000, weight_decay=0.0)
| [en] dictionary: 45272 types
| [actions] dictionary: 88 types
| loaded 1700 examples from: DATA/dep-parsing/features/PTB_RoBERTa-base/valid.en-actions.en
| loaded 1700 examples from: DATA/dep-parsing/features/PTB_RoBERTa-base/valid.en-actions.actions
| DATA/dep-parsing/features/PTB_RoBERTa-base valid en-actions 1700 examples
Traceback (most recent call last):
File "/home/liu/anaconda3/envs/seq2seq/bin/fairseq-train", line 33, in
sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
File "/home/liu/transition-amr-parser-0.3.3/fairseq-stack-transformer/fairseq_cli/train.py", line 335, in cli_main
nprocs=args.distributed_world_size,
File "/home/liu/anaconda3/envs/seq2seq/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 167, in spawn
while not spawn_context.join():
File "/home/liu/anaconda3/envs/seq2seq/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 114, in join
raise Exception(msg)
Exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/home/liu/anaconda3/envs/seq2seq/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/liu/transition-amr-parser-0.3.3/fairseq-stack-transformer/fairseq_cli/train.py", line 302, in distributed_main
main(args, init_distributed=True)
File "/home/liu/transition-amr-parser-0.3.3/fairseq-stack-transformer/fairseq_cli/train.py", line 47, in main
task.load_dataset(valid_sub_split, combine=False, epoch=0)
File "/home/liu/transition-amr-parser-0.3.3/fairseq-stack-transformer/fairseq/tasks/translation.py", line 251, in load_dataset
state_machine=state_machine
File "/home/liu/transition-amr-parser-0.3.3/fairseq-stack-transformer/fairseq/tasks/translation.py", line 126, in load_langpair_dataset
src_fixed_embeddings, src_fixed_embeddings.sizes,
AttributeError: 'NoneType' object has no attribute 'sizes'

Annotations has no method `toJAMRString`

Hi, I am trying to use the model from another Python script, following the directions in the README file.
I am getting an error that the toJAMRString method is not found, and it seems that the annotations object is a list of strings.

Neuro-Symbolic Reasoner

I was able to train the model as per your instructions, but my ultimate goal is to build a Neuro-Symbolic Reasoning system similar to this. Currently I have the "Semantic Parsing" part of the system working, but I still need the "Reasoning" part (namely AMR-to-logic and Logical Neural Networks). Any information about them (or links to the repositories) would be greatly appreciated.

problem with fairseq-preprocess

I was able to install everything as per your setup instructions. I ran the training script bash scripts/stack-transformer/experiment.sh configs/amr2_o5+Word100_roberta.large.top24_stnp6x6.sh
The datasets for training were generated, but it gets stuck when running the fairseq-preprocess command. I get the following error:

fairseq-preprocess: error: unrecognized arguments: --pretrained-embed roberta.large --bert-layers 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 --machine-type AMR --machine-rules DATA/AMR//oracles/amr2.0-cofill_o5+Word100//train.rules.json --entity-rules DATA/AMR//oracles/amr2.0-cofill_o5+Word100//entity_rules.json

Can you advise how to fix that?

Problems with CUDA out of memory

I attempted to train the model using bash run/run_experiment.sh configs/amr2.0-structured-bart-large-sep-voc.sh and it looks like my 12GB 2080Ti GPU doesn't have enough memory.
In fact, I have two 12GB 2080Ti GPUs on the server, but only one of them is used during training.
Does the code use multiple GPUs? Is there anything else I need to modify?
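
As a hedged aside, one thing to test is exposing both GPUs to the launch script via the standard CUDA_VISIBLE_DEVICES variable; whether the fairseq launcher then actually uses both devices depends on the distributed settings in the config, so treat this as an assumption to verify rather than a confirmed fix:

# hedged sketch: make both GPUs visible to the training run (standard CUDA env var);
# whether both are used depends on the distributed settings the config passes to fairseq
CUDA_VISIBLE_DEVICES=0,1 bash run/run_experiment.sh configs/amr2.0-structured-bart-large-sep-voc.sh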

GPU requirements

I attempted to train the model using bash run/run_experiment.sh configs/amr2.0-structured-bart-large-sep-voc.sh and it looks like my older 12GB Titan X GPU doesn't have enough memory. Can you let me know what you used for training, and approximately how long it takes to train?

In the above config file I tried changing BATCH_SIZE=128 to BATCH_SIZE=1 and I'm still getting CUDA OOM errors. Is there something else I need to modify to reduce the memory?

Do you know if this will train on a single 24GB GPU (e.g. an RTX 3090), and if so, how long that takes?

"Unable to infer Criterion arguments, please implement LabelSmoothedCrossEntropyPointerCriterion.build_criterion" when running minimal_test.sh

I'm trying to get going on WSL2, and have been running into some trouble getting set up.

At first the scripts were searching for fairseq_ext on the wrong path: specifically I was getting FileNotFound errors on the path /ibm-amr/fairseq_ext. This made sense because fairseq_ext was of course located in /ibm-amr/transition-amr-parser/fairseq_ext.

So I went ahead and did a global find & replace changing ../fairseq_ext to ./fairseq_ext, which got me a little further. Unfortunately, the problem then was:

ImportError: Failed to import --user-dir=/home/aianta/ibm-amr/transition-amr-parser/fairseq_ext because the corresponding module name (fairseq_ext) is not globally unique. Please rename the directory to something unique and try again.

Renaming the folder did not help, but I reasoned that if the issue is a duplicate module, that might suggest fairseq_ext was in fact already loaded. So I went ahead and commented out all instances of utils.import_user_module(usr_args), and sure enough, the minimal test proceeded further.

I'm now stuck on:

[Configuration file:]
configs/wiki25.sh
[Building oracle actions:]
[Configuration file:]
configs/wiki25.sh
Directory to aligner: DATA/wiki25/aligned/cofill/ already exists --- do nothing.
[normalize rules] months
[normalize rules] units
[normalize rules] cardinals
[normalize rules] ordinals
Reading DATA/wiki25/aligned/cofill//train.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Oracle: 25it [00:00, 431.06it/s]
Base actions:
Counter({'PRED': 63, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 62), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
80 singleton actions
Counter({'PRED': 59, 'ENTITY': 9, 'RA': 8, 'LA': 4})
Reading DATA/wiki25/aligned/cofill//dev.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Oracle: 25it [00:00, 445.79it/s]
Base actions:
Counter({'PRED': 63, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 62), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
80 singleton actions
Counter({'PRED': 59, 'ENTITY': 9, 'RA': 8, 'LA': 4})
Reading DATA/wiki25/aligned/cofill//test.txt
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
Oracle: 25it [00:00, 427.50it/s]
Base actions:
Counter({'PRED': 63, 'RA': 20, 'LA': 19, 'ENTITY': 15, 'REDUCE': 1, 'SHIFT': 1, 'COPY_LEMMA': 1, 'COPY_SENSE01': 1, 'MERGE': 1})
Most frequent actions:
[('SHIFT', 198), ('REDUCE', 184), ('COPY_LEMMA', 62), ('MERGE', 26), ('LA(root)', 22), ('COPY_SENSE01', 20), ('LA(:ARG1)', 18), ('RA(:ARG1)', 13), ('LA(:ARG0)', 12), ('PRED(person)', 12)]
80 singleton actions
Counter({'PRED': 59, 'ENTITY': 9, 'RA': 8, 'LA': 4})
[Preprocessing data:]
[Configuration file:]
configs/wiki25.sh
Cleaning up partially completed DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//
Namespace(user_dir='./fairseq_ext')
Namespace(alignfile=None, batch_normalize_reward=False, bert_layers=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//', embdir='DATA/wiki25/embeddings/RoBERTa-large-top24', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gold_annotations=None, gold_episode_ratio=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', machine_rules=None, machine_type=None, memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer=None, padding_factor=8, pretrained_embed='roberta.large', scoring='bleu', seed=1, source_lang='en', srcdict=None, target_lang='actions', task='amr_action_pointer_graphmp', tbmf_wrapper=False, tensorboard_logdir='', testpref='DATA/wiki25/oracles/cofill_o8.3_act-states//test', tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='DATA/wiki25/oracles/cofill_o8.3_act-states//train', user_dir='./fairseq_ext', validpref='DATA/wiki25/oracles/cofill_o8.3_act-states//dev', workers=1)
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//train.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//dev.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA/wiki25/oracles/cofill_o8.3_act-states//test.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//train.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//train.actions]
processing ...
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//train.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//dev.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//dev.actions]
processing ...
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//dev.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
----------------------------------------------------------------------------------------------------
Generate and process action states information (number of workers: 1):
[English sentence file: DATA/wiki25/oracles/cofill_o8.3_act-states//test.en]
[AMR actions file: DATA/wiki25/oracles/cofill_o8.3_act-states//test.actions]
processing ...
finished !
Processed data saved to path with prefix: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//test.en-actions.actions
Total time elapsed: 0s
----------------------------------------------------------------------------------------------------
| [actions] DATA/wiki25/oracles/cofill_o8.3_act-states//test.actions_nopos: 25 sents, 796 tokens, 0.0% replaced by <unk>
Using cache found in /home/aianta/.cache/torch/hub/pytorch_fairseq_main
Using roberta.large extraction in GPU

Using cache found in /home/aianta/.cache/torch/hub/pytorch_fairseq_main
Using roberta.large extraction in GPU

Using cache found in /home/aianta/.cache/torch/hub/pytorch_fairseq_main
Using roberta.large extraction in GPU

| Wrote preprocessed oracle data to DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//
| Wrote preprocessed embedding data to DATA/wiki25/embeddings/RoBERTa-large-top24
[Training:]
[Configuration file:]
configs/wiki25.sh
Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas='(0.9,0.98)', adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, all_gather_list_size=16384, append_eos_to_target=0, apply_tgt_actnode_masks=0, apply_tgt_input_src=0, apply_tgt_src_align=1, apply_tgt_vocab_masks=1, arch='transformer_tgt_pointer_graphmp', attention_dropout=0.0, batch_size=None, batch_size_valid=None, bert_backprop=False, best_checkpoint_metric='loss', bf16=False, bpe=None, broadcast_buffers=False, bucket_cap_mb=25, checkpoint_shard_count=1, checkpoint_suffix='', clip_norm=0.0, collate_tgt_states=1, cpu=False, criterion='label_smoothed_cross_entropy_pointer', curriculum=0, data='DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//', data_buffer_size=10, dataset_impl=None, ddp_backend='c10d', decoder_attention_heads=4, decoder_embed_dim=256, decoder_embed_path=None, decoder_ffn_embed_dim=512, decoder_input_dim=256, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=256, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_num_procs=1, distributed_port=-1, distributed_rank=0, distributed_world_size=1, distributed_wrapper='DDP', dropout=0.3, emb_dir='DATA/wiki25/embeddings/RoBERTa-large-top24', empty_cache_freq=0, encode_state_machine=None, encoder_attention_heads=4, encoder_embed_dim=256, encoder_embed_path=None, encoder_ffn_embed_dim=512, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=False, fast_stat_sync=False, find_unused_parameters=False, finetune_from_model=None, fix_batches_to_gpus=False, fixed_validation_seed=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', keep_best_checkpoints=-1, keep_interval_updates=-1, keep_last_epochs=6, label_smoothing=0.01, lazy_load=False, left_pad_source='True', left_pad_target='False', localsgd_frequency=3, log_format='json', log_interval=100, loss_coef=1.0, lr=[0.0005], lr_scheduler='inverse_sqrt', max_epoch=10, max_source_positions=1024, max_target_positions=1024, max_tokens=3584, max_tokens_valid=3584, max_update=0, maximize_best_checkpoint_metric=False, memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=1e-09, model_parallel_size=1, no_bert_precompute=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_seed_provided=False, no_token_positional_embeddings=False, nprocs_per_node=1, num_shards=1, num_workers=1, optimizer='adam', optimizer_overrides='{}', patience=-1, pipeline_balance=None, pipeline_checkpoint='never', pipeline_chunks=0, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_devices=None, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_model_parallel=False, pointer_dist_decoder_selfattn_avg=0, pointer_dist_decoder_selfattn_heads=1, pointer_dist_decoder_selfattn_infer=5, pointer_dist_decoder_selfattn_layers=[5], pretrained_embed_dim=1024, profile=False, quantization_config_path=None, raw_text=False, required_batch_size_multiple=8, required_seq_len_multiple=1, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', 
save_dir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', save_interval=1, save_interval_updates=0, scoring='bleu', seed=42, sentence_avg=False, shard_id=0, share_all_embeddings=False, share_decoder_input_output_embed=0, shift_pointer_value=1, skip_invalid_size_inputs_valid_test=False, slowmo_algorithm='LocalSGD', slowmo_momentum=None, source_lang=None, stop_time_hours=0, target_lang=None, task='amr_action_pointer_graphmp', tensorboard_logdir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', tgt_factored_emb_out=0, tgt_graph_heads=2, tgt_graph_layers=[0, 1, 2], tgt_graph_mask='allprev_1in1out', tgt_input_src_backprop=1, tgt_input_src_combine='add', tgt_input_src_emb='top', tgt_src_align_focus=['p0c1n0', 'p0c0n*'], tgt_src_align_heads=2, tgt_src_align_layers=[0, 1, 2, 3, 4, 5], threshold_loss_scale=None, tokenizer=None, tpu=False, train_subset='train', update_freq=[1], upsample_primary=1, use_bmuf=False, use_old_adam=False, user_dir='./fairseq_ext', valid_subset='valid', validate_after_updates=0, validate_interval=1, validate_interval_updates=0, warmup_init_lr=1e-07, warmup_updates=4000, weight_decay=0.0, zero_sharding='none')
| [en] dictionary: 248 types
| [actions_nopos] dictionary: 128 types
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.en
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.bert
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wordpieces
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wp2w
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_in
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_out
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.pos
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.vocab_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.src_cursors
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_1stnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_1stnode_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_directions
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_directions
Traceback (most recent call last):
  File "fairseq_ext/train.py", line 338, in <module>
    cli_main()
  File "fairseq_ext/train.py", line 334, in cli_main
    main(args)
  File "fairseq_ext/train.py", line 73, in main
    criterion = task.build_criterion(args)
  File "/home/aianta/anaconda3/envs/amr-2/lib/python3.7/site-packages/fairseq/tasks/fairseq_task.py", line 289, in build_criterion
    return criterions.build_criterion(args, self)
  File "/home/aianta/anaconda3/envs/amr-2/lib/python3.7/site-packages/fairseq/criterions/__init__.py", line 31, in build_criterion
    return build_criterion_(criterion_cfg, task)
  File "/home/aianta/anaconda3/envs/amr-2/lib/python3.7/site-packages/fairseq/registry.py", line 54, in build_x
    return builder(args, *extra_args, **extra_kwargs)
  File "/home/aianta/anaconda3/envs/amr-2/lib/python3.7/site-packages/fairseq/criterions/fairseq_criterion.py", line 57, in build_criterion
    "{}.build_criterion".format(cls.__name__)
NotImplementedError: Unable to infer Criterion arguments, please implement LabelSmoothedCrossEntropyPointerCriterion.build_criterion

This is my conda list:

# packages in environment at /home/aianta/anaconda3/envs/amr-2:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
_openmp_mutex             5.1                       1_gnu
antlr-python-runtime      4.8                pyhd8ed1ab_3    conda-forge
astunparse                1.6.3                      py_0
attrs                     21.4.0             pyhd3eb1b0_0
blas                      1.0                         mkl
brotlipy                  0.7.0           py37h27cfd23_1003
bzip2                     1.0.8                h7b6447c_0
c-ares                    1.18.1               h7f8727e_0
ca-certificates           2022.4.26            h06a4308_0
catalogue                 1.0.0                    py37_1
certifi                   2022.6.15        py37h06a4308_0
cffi                      1.15.0           py37hd667e15_1
charset-normalizer        2.0.4              pyhd3eb1b0_0
cmake                     3.22.1               h1fce559_0
colorama                  0.4.5              pyhd8ed1ab_0    conda-forge
cryptography              37.0.1           py37h9ce1e76_0
cudatoolkit               11.3.1               h2bc3f7f_2
cymem                     2.0.6            py37h295c915_0
cython                    0.29.30          py37hd23a5d3_0    conda-forge
cython-blis               0.7.7            py37hce1f21e_0
dataclasses               0.8                pyh6d0b6a4_7
en-core-web-sm            2.3.1                    pypi_0    pypi
expat                     2.4.4                h295c915_0
fairseq                   0.10.2           py37hdc94413_0    conda-forge
ffmpeg                    4.3                  hf484d3e_0    pytorch
freetype                  2.10.4               h0708190_1    conda-forge
future                    0.18.2                   py37_1
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.6.13               h85f3911_1    conda-forge
hydra-core                1.1.1              pyhd8ed1ab_0    conda-forge
idna                      3.3                pyhd3eb1b0_0
importlib-metadata        4.11.3           py37h06a4308_0
importlib_metadata        4.11.3               hd3eb1b0_0
importlib_resources       5.8.0              pyhd8ed1ab_0    conda-forge
intel-openmp              2021.4.0          h06a4308_3561
jpeg                      9e                   h166bdaf_1    conda-forge
jsonschema                3.0.2                    py37_0
krb5                      1.19.2               hac12032_0
lame                      3.100             h7f98852_1001    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.38                 h1181459_1
libcurl                   7.82.0               h0b77cf5_0
libedit                   3.1.20210910         h7f8727e_0
libev                     4.33                 h7f8727e_1
libffi                    3.3                  he6710b0_2
libgcc-ng                 11.2.0               h1234567_1
libgomp                   11.2.0               h1234567_1
libiconv                  1.17                 h166bdaf_0    conda-forge
libnghttp2                1.46.0               hce63b2e_0
libpng                    1.6.37               h21135ba_2    conda-forge
libssh2                   1.10.0               h8f2d780_0
libstdcxx-ng              11.2.0               h1234567_1
libtiff                   4.2.0                h2818925_1
libuv                     1.40.0               h7b6447c_0
libwebp-base              1.2.2                h7f98852_1    conda-forge
lz4-c                     1.9.3                h295c915_1
mkl                       2021.4.0           h06a4308_640
mkl-include               2022.0.1           h06a4308_117
mkl-service               2.4.0            py37h7f8727e_0
mkl_fft                   1.3.1            py37hd3c417c_0
mkl_random                1.2.2            py37h51133e4_0
murmurhash                1.0.7            py37h295c915_0
ncurses                   6.3                  h7f8727e_2
nettle                    3.6                  he412f7d_0    conda-forge
ninja                     1.10.2               h06a4308_5
ninja-base                1.10.2               hd09550d_5
numpy                     1.21.5           py37h6c91a56_3
numpy-base                1.21.5           py37ha15fc14_3
olefile                   0.46               pyh9f0ad1d_1    conda-forge
omegaconf                 2.1.1            py37h89c1867_1    conda-forge
openh264                  2.1.1                h780b84a_0    conda-forge
openssl                   1.1.1o               h7f8727e_0
packaging                 21.3               pyhd3eb1b0_0
pillow                    7.2.0            py37h718be6c_2    conda-forge
pip                       21.2.2           py37h06a4308_0
plac                      1.1.0                    py37_1
portalocker               2.4.0            py37h89c1867_0    conda-forge
preshed                   3.0.6            py37h295c915_0
pycparser                 2.21               pyhd3eb1b0_0
pyopenssl                 22.0.0             pyhd3eb1b0_0
pyparsing                 3.0.4              pyhd3eb1b0_0
pyrsistent                0.18.0           py37heee7806_0
pysocks                   1.7.1                    py37_1
python                    3.7.13               h12debd9_0
python_abi                3.7                     2_cp37m    conda-forge
pytorch                   1.10.1          py3.7_cuda11.3_cudnn8.2.0_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pyyaml                    6.0              py37h7f8727e_1
readline                  8.1.2                h7f8727e_1
regex                     2022.4.24        py37h540881e_0    conda-forge
requests                  2.27.1             pyhd3eb1b0_0
rhash                     1.4.1                h3c74f83_1
sacrebleu                 2.1.0              pyhd8ed1ab_0    conda-forge
setuptools                61.2.0           py37h06a4308_0
six                       1.16.0             pyhd3eb1b0_1
spacy                     2.3.5            py37hff7bd54_0
sqlite                    3.38.5               hc218d9a_0
srsly                     1.0.5            py37h2531618_0
tabulate                  0.8.10             pyhd8ed1ab_0    conda-forge
thinc                     7.4.5            py37h9a67853_0
tk                        8.6.12               h1ccaba5_0
torch                     1.13.0a0+git4300f64          pypi_0    pypi
torchaudio                0.10.1               py37_cu113    pytorch
torchvision               0.11.2               py37_cu113    pytorch
tqdm                      4.64.0           py37h06a4308_0
typing                    3.10.0.0           pyhd8ed1ab_0    conda-forge
typing_extensions         4.1.1              pyh06a4308_0
urllib3                   1.26.9           py37h06a4308_0
wasabi                    0.9.1            py37h06a4308_0
wheel                     0.37.1             pyhd3eb1b0_0
xz                        5.2.5                h7f8727e_1
yaml                      0.2.5                h7b6447c_0
zipp                      3.8.0            py37h06a4308_0
zlib                      1.2.12               h7f8727e_2
zstd                      1.5.2                ha4553b6_0

The correctly_installed.sh script returns:

pytorch 1.10.1
cuda 11.3
Apex not installed
Pytorch binaries were compiled with Cuda 11.3 but binary /usr/local/cuda/bin/nvcc is 11.4,
fairseq 0.10.2
spacy 2.3.5
[OK] correctly installed

Are the pytorch libraries the issue here?

No same-as edges in example for docamr

Running the example docamr code below does not give same-as edges for coreference:

from transition_amr_parser.parse import AMRParser

# Download and save the docamr model to cache
parser = AMRParser.from_pretrained('doc-sen-conll-amr-seed42')

# Sentences in the doc
doc = ["Hailey likes to travel." ,"She is going to London tomorrow.", "She will walk to Big Ben when she goes to London."]

# tokenize sentences if not already tokenized
tok_sentences = []
for sen in doc:
    tokens, positions = parser.tokenize(sen)
    tok_sentences.append(tokens)

# parse docs takes a list of docs as input
annotations, machines = parser.parse_docs([tok_sentences])

# Print Penman notation
print(annotations[0])

# Print Penman notation without JAMR, with ISI
amr = machines[0].get_amr()
print(amr.to_penman(jamr=False, isi=True))

# Plot the graph (requires matplotlib)
amr.plot()

Instead I get a warning for disconnected graphs and this output:
(screenshot of the output attached)

Integrating stack-transformer into another project?

Hi,

I was looking into integrating the stack-transformer into a transition-based constituency parser. My initial reading of this codebase makes me think the stack transformer is pretty tightly coupled with the parser actions. Is there an easier entry point for using just the stack transformer with our own inputs?

Thanks!

file not found error

Hi, I tried to train the model but cannot find the file indicated in the README. Is it an empty directory that we need to create manually?

Error message:
No such file or directory: '/path/to/LDC2017T10/data/amrs/split/training/'

`parse_sentences()` crashes when there is no root

Thank you for this great library! I have encountered a bug when parsing a large list of strings: certain strings generate AMR representations without a root, which makes parse_sentences() crash with the following error:

WARNING: missing root

File ~/python/venvs/amr/lib/python3.8/site-packages/transition_amr_parser/parse.py:1067, in AMRParser.parse_sentences(self, batch, batch_size, roberta_batch_size, gold_amrs, force_actions, beam, jamr, no_isi, unicode_normalize)
   1063 machine_nbest = completed_machines[i]
   1065 if len(machine_nbest) == 1:
   1066     annotations.append(
-> 1067         machine_nbest[0].get_annotation(jamr=jamr, no_isi=no_isi)
   1068     )
   1069     machines.append(machine_nbest[0])
   1070 else:

File ~/python/venvs/amr/lib/python3.8/site-packages/transition_amr_parser/amr_machine.py:1307, in AMRStateMachine.get_annotation(self, node_map, jamr, no_isi)
   1304     return self.align_tracker.add_alignments_to_penman(self)
   1306 else:
-> 1307     amr = self.get_amr(node_map=node_map)
   1308     return amr.to_penman(jamr=jamr, isi=not no_isi)

File ~/python/venvs/amr/lib/python3.8/site-packages/transition_amr_parser/amr_machine.py:1287, in AMRStateMachine.get_amr(self, node_map)
   1282 tokens, nodes, edges, root, alignments = create_valid_amr(
   1283     self.tokens, self.nodes, self.edges, self.root, self.alignments
   1284 )
   1286 # create an AMR class
-> 1287 amr = AMR(tokens, nodes, edges, root, alignments=alignments)
   1289 # use valid node names
   1290 if node_map is None:

File ~/python/venvs/amr/lib/python3.8/site-packages/transition_amr_parser/amr.py:713, in AMR.__init__(self, tokens, nodes, edges, root, penman, alignments, sentence, id, sentence_ends)
    711 self.root = root
    712 self.roots = []
--> 713 if self.nodes[self.root] == 'document':
    714     for (s, rel, t) in self.edges:
    715         if s == self.root and rel.startswith(':snt'):

KeyError: None

My current workaround is to use parse_sentence() with a try/except, but it would be better to be able to send the sentences in batches using parse_sentences().
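
For anyone hitting the same thing, here is a minimal sketch of the try/except workaround described above, reusing the model name and sentences from the reproduction snippet below; the assumption is that the rootless case only surfaces as the KeyError shown in the traceback:

from transition_amr_parser.parse import AMRParser

parser = AMRParser.from_pretrained("AMR2-joint-ontowiki-seed42")
sentences = ["The next string has no root.", "!"]

annotations, machines = [], []
for sentence in sentences:
    tokens, _ = parser.tokenize(sentence)
    try:
        # parse one sentence at a time so a rootless graph only loses this item
        annotation, machine = parser.parse_sentence(tokens)
    except KeyError:
        # the missing-root case surfaces as KeyError: None (see traceback above)
        annotation, machine = None, None
    annotations.append(annotation)
    machines.append(machine)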

Steps to reproduce

from transition_amr_parser.parse import AMRParser

model_name = "AMR2-joint-ontowiki-seed42"
parser = AMRParser.from_pretrained(model_name)

sentences = ["The next string has no root.","!"]
sentences_tokens = [parser.tokenize(i)[0] for i in sentences]
annotations, machines = parser.parse_sentences(sentences_tokens)

Instruction of Successfully Installing JAMR for obtaining Alignment

The original JAMR was installed on the server in 2015 or 2016, so some packages were broken or outdated. The default setup script in the JAMR repo points to a version that is no longer available.
I have pushed a new JAMR repo to deal with this issue. More details in jflanigan/jamr#44.

The most important part is to install updated versions of Scala and sbt.
To install them manually, you can use SDKMAN! and run the following:

sdk install sbt 1.0.2
sdk install scala 2.11.12

And then you could replace

git clone https://github.com/jflanigan/jamr.git
with https://github.com/DreamerDeo/JAMR.git, where the Scala version is pinned in some config files.
(Of course, you could also clone the original JAMR repo and make these changes manually.)
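
Putting the steps above together, a hedged end-to-end sketch (the SDKMAN init line and the target directory name jamr are assumptions on my side; the versions are the ones listed above):

# assumes SDKMAN! is already installed; source its init script so the sdk command is on PATH
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install sbt 1.0.2
sdk install scala 2.11.12
# clone the updated fork in place of the original jflanigan/jamr repository
git clone https://github.com/DreamerDeo/JAMR.git jamr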

Where does entity_rules.json get generated?

I trained a model according to the instructions, and it seems like there was a file entity_rules.json that was supposed to be generated. It looks like these rules are necessary for post-processing, but I am really confused about where in the preprocessing/training code the file gets created.

Stochastic results with the pretrained parser

from transition_amr_parser.parse import AMRParser
parser = AMRParser.from_checkpoint('DATA/amr2joint_ontowiki2_g2g/models/amr2joint_ontowiki2_g2g-structured-bart-large/seed44/checkpoint_wiki.smatch_top5-avg.pt')
# use parse_sentences() for a batch of sentences
tokens, positions = parser.tokenize('The coronavirus pandemic has upended the traditional runway format, and in its place a mix of virtual and, in some cases, physical shows with limited audience numbers has started to roll out.')
annotations, decoding_data = parser.parse_sentence(tokens)
# Print Penman 
print(annotations)

Given the script above:

Sometimes the logs show WARNING: disconnected graphs and sometimes they do not, resulting in inconsistent results (the Penman notations shown here) for the same sentence and the same pretrained weights.

For example:

Running on batch size: 1
1
decoding: 100%|██████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.19it/s]
WARNING: disconnected graphs
# ::tok The coronavirus pandemic has upended the traditional runway format , and in its place a mix of virtual and , in some cases , physical shows with limited audience numbers has started to roll out .
(a / and~10
    :op1 (u / up-01~4
        :ARG0 (p / pandemic~2
            :mod (v2 / virus~1))
        :ARG1 (f / format~8
            :ARG1-of (s5 / stud-01~7)
            :mod (r3 / runway~7)
            :mod u))
    :op2 (s4 / start-01~31
        :ARG1 (r2 / roll-out-02~33
            :ARG1 (m / mix-01~15
                :ARG1 (s2 / show-04~25
                    :mod (v / virtual~17))
                :ARG2 (s / show-04~18
                    :mod (c / case-04~22
                        :quant (s3 / some~21))
                    :mod (p2 / physical~24)
                    :prep-with (n / number~29
                        :ARG1-of (l / limit-01~27)
                        :quant-of (a2 / audience~28)))))
        :ARG2-of (r / replace-01~13
            :ARG1 f))
    :rel (t / tradition~6))

and

Running on batch size: 1
1
decoding: 100%|██████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.40s/it]
# ::tok The coronavirus pandemic has upended the traditional runway format , and in its place a mix of virtual and , in some cases , physical shows with limited audience numbers has started to roll out .
(a / and~10
    :op1 (u / upheaval-01~4
        :ARG0 (p / pandemic~2
            :mod (c2 / coronavirus~1))
        :ARG1 (f / format~8
            :mod (r3 / runway~7)
            :mod (t / tradition~6)))
    :op2 (s4 / start-01~31
        :ARG1 (r2 / roll-out-02~33
            :ARG1 (m / mix-01~15
                :ARG1 (s2 / show-04~25
                    :mod (v / virtual~17))
                :ARG2 (s / show-04~16
                    :mod (c / case-04~22
                        :quant (s3 / some~21))
                    :mod (p2 / physical~24)
                    :prep-with (n / number~29
                        :ARG1-of (l / limit-01~27)
                        :quant-of (a2 / audience~28)))))
        :ARG2-of (r / replace-01~13
            :ARG1 f)))

Is this kind of behavior expected?
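
As a hedged diagnostic (not a confirmed fix), pinning the usual RNG sources before decoding can tell you whether the variation comes from seeding; if the warning still appears intermittently, the cause is more likely non-deterministic GPU kernels. The sketch below reuses parser and tokens from the snippet above:

import random

import numpy as np
import torch

def pin_seeds(seed=42):
    # pin the Python, NumPy and torch (CPU and GPU) random number generators
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

pin_seeds(42)
annotations, decoding_data = parser.parse_sentence(tokens)
print(annotations)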

Thank you,

Use pretrained model without training data

Currently the option for training data -A is required to run the parser. There should be an option for loading a model and decoding without needing to supply AMR train data or BERT embeddings for training.

Where to send email to get access to pretrained models?

I'm looking to generate some AMRs from application logs. I saw that some pre-trained models were available upon request:
We provide these on an individual basis (sends us an email).

But I didn't find an email in the readme to send the request to.

Question about cloning

Hi!

I have tried to clone the repository as in the README.md (git clone [email protected]:mnlp/transition-amr-parser.git), but in response I always get:
[email protected]: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I already added an SSH key to my git account for authorization and tested it with ssh -T [email protected]. Nonetheless, I still have the same problem when trying to clone the repository (even through HTTPS). Any suggestion on how to solve it?
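
For what it's worth, a hedged sketch of cloning over HTTPS from the repository's public GitHub location; the git@ remote quoted above appears to point elsewhere (an assumption on my part), so this may sidestep the publickey issue:

# hedged sketch: clone the public repository over HTTPS instead of SSH
git clone https://github.com/IBM/transition-amr-parser.git
cd transition-amr-parser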

Thanks,
Anton

problems with setup

I'm following your setup instructions, but when I run bash tests/minimal_test.sh I get the following error:

ModuleNotFoundError: No module named 'fairseq_ext.modules'

The result of bash tests/correctly_installed.sh:
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

pytorch 1.2.0
No CUDA available
fairseq 0.8.0
spacy 2.0.16
[OK] correctly installed

Question about decoding with pre-trained model by command line in README

Hi,
I'm having a problem with decoding using a pre-trained model.
I followed the README.md for the installation, and the installation test printed out "[OK] correctly installed".
Then I skipped the command bash tests/minimal_test.sh and ran amr-parse -c ./DATA/amr2joint_ontowiki2_g2g/models~~~~ -i plain_textfile.txt -o output.amr.

It gives me this error: amr-parse: command not found
What did I miss here? Or what should I do to set up the amr-parse command?
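
A hedged checklist, under the assumption that the package was installed into a different environment than the one active in the current shell: the amr-parse console script is created when the package is installed, so activating the same environment and (re)installing usually makes it visible.

# hedged sketch: activate the environment used for installation, then (re)install the package
source set_environment.sh        # or however you activate your virtual environment
pip install --editable .         # or the pip install command you used originally
which amr-parse                  # should now print the path of the console script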

Thank you.

Best,
Paul

Bus Error when checking installation

Hello,

After installing the necessary packages with:

source set_environment.sh
pip install --editable .

When I run bash tests/correctly_installed.sh, I get:

pytorch 1.10.1
No CUDA available
smatch installed
tests/correctly_installed.sh: line 10:  2319 Bus error: 10           python tests/correctly_installed.py

Would you happen to know what causes this?

Worth keeping in mind:

  • Machine has MacOS Ventura 13.0.1
  • I am using python 3.8

Thanks in advance!

Failed install and "Needs linking cache DATA/EL/legacy_linker"

Hi,

When checking whether the parser is correctly installed, it reports a failed installation, although it's not very clear to me why. The hint is: "maybe LD_LIBRARY_PATH unconfigured?" (see attached). I've also manually downloaded spacy 2.0.16, thinking from this post that it could help, but it did not (#13).

When running the test file, I also encounter another error: "Needs linking cache DATA/EL/legacy_linker_amr3.0/". Is there a file missing?
(screenshot attached: 2022-02-09, 17:25:31)

Many thanks in advance for any help!
Best wishes

APT Checkpoints?

Hi @jzhou316!

The instructions say:

from transition_amr_parser.parse import AMRParser
parser = AMRParser.from_checkpoint(in_checkpoint)
annotations = parser.parse_sentences([['The', 'boy', 'travels'], ['He', 'visits', 'places']])
# Penman notation
print(''.join(annotations[0][0]))

Which 'checkpoint' am I supposed to use here?

Thanks!

Zoher

Fetching multisentence AMR

Hi @ramon-astudillo ,

Is the process for fetching multi-sentence AMR different from single sentences? Or do you still use

tokens, positions = parser.tokenize('The girl travels and visits places. She like traveling.')
annotations, decoding_data = parser.parse_sentence(tokens)
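
For comparison, a hedged sketch based on the document-level example earlier in this thread: multi-sentence (document-level) AMR goes through parse_docs() with the doc-level checkpoint from that example, rather than concatenating the sentences into parse_sentence(). The sentences below are illustrative:

from transition_amr_parser.parse import AMRParser

# doc-level checkpoint from the docAMR example earlier in this thread
parser = AMRParser.from_pretrained('doc-sen-conll-amr-seed42')

doc = ['The girl travels and visits places.', 'She likes traveling.']
tok_sentences = [parser.tokenize(sen)[0] for sen in doc]

# parse_docs takes a list of documents, each a list of tokenized sentences
annotations, machines = parser.parse_docs([tok_sentences])
print(annotations[0])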

Thanks!

Question of Support

Hello, this is less of an 'issue' and more of a question: are there plans to support Python 3.11.x and PyTorch/torch-scatter 2.x? I'm asking in part because I have an Apple M1 Pro MacBook and I haven't been able to get things working natively.

parser doesn't produce amr-unknown

I was able to train the parser as per your instructions, but when testing the trained model I found that it didn't produce an amr-unknown node. For example:

Text: Which architect of Marine Corps Air Station Kaneohe Bay was also tenant of New Sanno hotel?
# ::node	1	person	1-2
# ::node	2	architect-01	1-2
# ::node	3	facility	3-9
# ::node	5	also	10-11
# ::node	6	reside-01	11-12
# ::node	7	company	13-16
# ::node	10	name	3-9
# ::node	11	"Marine"	3-9
# ::node	12	"Corps"	3-9
# ::node	13	"Air"	3-9
# ::node	14	"Station"	3-9
# ::node	15	"Kaneohe"	3-9
# ::node	16	"Bay"	3-9
# ::node	18	name	13-16
# ::node	19	"New"	13-16
# ::node	20	"Sanno"	13-16
# ::node	21	"Hotel"	13-16
# ::root	6	reside-01
# ::edge	person	ARG0-of	architect-01	1	2	
# ::edge	architect-01	ARG1	facility	2	3	
# ::edge	reside-01	mod	also	6	5	
# ::edge	reside-01	ARG0	person	6	1	
# ::edge	reside-01	ARG1	company	6	7	
# ::edge	facility	name	name	3	10	
# ::edge	name	op1	"Marine"	10	11	
# ::edge	name	op2	"Corps"	10	12	
# ::edge	name	op3	"Air"	10	13	
# ::edge	name	op4	"Station"	10	14	
# ::edge	name	op5	"Kaneohe"	10	15	
# ::edge	name	op6	"Bay"	10	16	
# ::edge	company	name	name	7	18	
# ::edge	name	op1	"New"	18	19	
# ::edge	name	op2	"Sanno"	18	20	
# ::edge	name	op3	"Hotel"	18	21	
# ::short	{1: 'p', 2: 'a', 3: 'f', 5: 'a2', 6: 'r', 7: 'c', 10: 'n', 11: 'x0', 12: 'x1', 13: 'x2', 14: 'x3', 15: 'x4', 16: 'x5', 18: 'n2', 19: 'x6', 20: 'x7', 21: 'x8'}	
(r / reside-01
      :ARG0 (p / person
            :ARG0-of (a / architect-01
                  :ARG1 (f / facility
                        :name (n / name
                              :op1 "Marine"
                              :op2 "Corps"
                              :op3 "Air"
                              :op4 "Station"
                              :op5 "Kaneohe"
                              :op6 "Bay"))))
      :ARG1 (c / company
            :name (n2 / name
                  :op1 "New"
                  :op2 "Sanno"
                  :op3 "Hotel"))
      :mod (a2 / also))

Can't download bart model

When I run bash run/run_experiment.sh configs/amr2.0-structured-bart-large-sep-voc.sh (or bash tests/minimal_test.sh) I get the following error:

Downloading: "https://github.com/pytorch/fairseq/archive/master.zip" to /home/bjascob/.cache/torch/hub/master.zip
Traceback (most recent call last):
  File "fairseq_ext/preprocess_bartsv.py", line 331, in <module>
    cli_main()
  File "fairseq_ext/preprocess_bartsv.py", line 327, in cli_main
    main(args)
  File "fairseq_ext/preprocess_bartsv.py", line 290, in main
    make_bart_encodings(args, tokenize=tokenize)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/binarize_encodings.py", line 152, in make_bart_encodings
    make_binary_bert_features(args, args.trainpref, "train", tokenize)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/binarize_encodings.py", line 62, in make_binary_bert_features
    pretrained_embeddings = SentenceEncodingBART(
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/sentence_encoding.py", line 111, in __init__
    self.model = torch.hub.load('pytorch/fairseq', name)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/torch/hub.py", line 358, in load
    repo_dir = _get_cache_or_reload(github, force_reload, verbose)
...etc...

The download https://github.com/pytorch/fairseq/archive/master.zip is not a valid location. Since this is buried deep in the torch hub download code, I'm thinking this is a compatibility issue between the old torch 1.4.0 version being used and the current torch 1.10 version. Any idea of the best way to get this to work?

Did you release the code of (Drozdov et al 2022) ?

Thanks for releasing the code; it's very helpful. However, I'm having trouble finding the code mentioned in (Drozdov et al., 2022).

I've carefully looked into the README.md of the structured-mbart, action-pointer, and stack-transformer branches as well as the master branch, but none of them contains instructions for (Drozdov et al., 2022). Did you release the code for (Drozdov et al., 2022)? I would be very grateful if you could help me find it.

Thank you for your time on this matter.

trained checkpoints

I was wondering whether I can download the checkpoint of the trained model mentioned in the paper?

Disconnected Graphs Error

Hi @ramon-astudillo ,

I get a very concerning error:
(screenshot of the error attached)
I have no idea how to interpret it, because I have rerun it on the same data multiple times: half the time I get this error, and other times I do not.

Best,
Zoher

Question about "bash tests/minimal_test.sh"

"TypeError: 'NoneType' object is not subscriptable" when I run this instruction.

Here is the detail:

| Wrote preprocessed oracle data to DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//
| Wrote preprocessed embedding data to DATA/wiki25/embeddings/RoBERTa-large-top24
[Training:]
[Configuration file:]
configs/wiki25.sh
| distributed init (rank 0): tcp://localhost:16604
| distributed init (rank 5): tcp://localhost:16604
| initialized host localhost as rank 5
| distributed init (rank 6): tcp://localhost:16604
| initialized host localhost as rank 6
| distributed init (rank 3): tcp://localhost:16604
| initialized host localhost as rank 3
| distributed init (rank 4): tcp://localhost:16604
| initialized host localhost as rank 4
| distributed init (rank 1): tcp://localhost:16604
| initialized host localhost as rank 1
| distributed init (rank 2): tcp://localhost:16604
| initialized host localhost as rank 2
| distributed init (rank 7): tcp://localhost:16604
| initialized host localhost as rank 7
| initialized host localhost as rank 0
Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas='(0.9,0.98)', adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff=None, adaptive_softmax_dropout=0, append_eos_to_target=0, apply_tgt_actnode_masks=0, apply_tgt_input_src=0, apply_tgt_src_align=1, apply_tgt_vocab_masks=1, arch='transformer_tgt_pointer_graphmp', attention_dropout=0.0, bert_backprop=False, best_checkpoint_metric='loss', bpe=None, bucket_cap_mb=25, clip_norm=0.0, collate_tgt_states=1, cpu=False, criterion='label_smoothed_cross_entropy_pointer', curriculum=0, data='DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//', dataset_impl=None, ddp_backend='c10d', decoder_attention_heads=4, decoder_embed_dim=256, decoder_embed_path=None, decoder_ffn_embed_dim=512, decoder_input_dim=256, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=256, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method='tcp://localhost:16604', distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=8, dropout=0.3, emb_dir='DATA/wiki25/embeddings/RoBERTa-large-top24', encode_state_machine=None, encoder_attention_heads=4, encoder_embed_dim=256, encoder_embed_path=None, encoder_ffn_embed_dim=512, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=False, find_unused_parameters=False, fix_batches_to_gpus=False, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, keep_interval_updates=-1, keep_last_epochs=6, label_smoothing=0.01, lazy_load=False, left_pad_source='True', left_pad_target='False', log_format='json', log_interval=1000, loss_coef=1.0, lr=[0.0005], lr_scheduler='inverse_sqrt', max_epoch=10, max_sentences=None, max_sentences_valid=None, max_source_positions=1024, max_target_positions=1024, max_tokens=3584, max_tokens_valid=3584, max_update=0, maximize_best_checkpoint_metric=False, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=1e-09, no_bert_precompute=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_progress_bar=False, no_save=False, no_save_optimizer_state=False, no_token_positional_embeddings=False, num_workers=1, optimizer='adam', optimizer_overrides='{}', pointer_dist_decoder_selfattn_avg=0, pointer_dist_decoder_selfattn_heads=1, pointer_dist_decoder_selfattn_infer=5, pointer_dist_decoder_selfattn_layers=[5], pretrained_embed_dim=1024, raw_text=False, required_batch_size_multiple=8, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', save_interval=1, save_interval_updates=0, seed=42, sentence_avg=False, share_all_embeddings=False, share_decoder_input_output_embed=0, shift_pointer_value=1, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, task='amr_action_pointer_graphmp', tbmf_wrapper=False, tensorboard_logdir='DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42', tgt_factored_emb_out=0, tgt_graph_heads=2, tgt_graph_layers=[0, 1, 2], tgt_graph_mask='allprev_1in1out', tgt_input_src_backprop=1, tgt_input_src_combine='add', tgt_input_src_emb='top', tgt_src_align_focus=['p0c1n0', 'p0c0n*'], tgt_src_align_heads=2, 
tgt_src_align_layers=[0, 1, 2, 3, 4, 5], threshold_loss_scale=None, tokenizer=None, train_subset='train', update_freq=[1], upsample_primary=1, use_bmuf=False, user_dir='../fairseq_ext', valid_subset='valid', validate_interval=1, warmup_init_lr=1e-07, warmup_updates=4000, weight_decay=0.0)
| [en] dictionary: 248 types
| [actions_nopos] dictionary: 128 types
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.en
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.bert
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wordpieces
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/valid.en-actions.en.wp2w
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_in
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.nopos_out
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.pos
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.vocab_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.src_cursors
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_1stnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_cur_1stnode_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_directions
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//valid.en-actions.actions.actedge_allpre_directions
| model transformer_tgt_pointer_graphmp, criterion LabelSmoothedCrossEntropyPointerCriterion
| num. model params: 8298496 (num. trained: 8298496)
| training on 8 GPUs
| max tokens per GPU = 3584 and max sentences per GPU = None
| no existing checkpoint found DATA/wiki25/models/exp_cofill_o8.3_act-states_RoBERTa-large-top24/_act-pos-grh_vmask1_shiftpos1_ptr-lay6-h1_grh-lay123-h2-allprev_1in1out_cam-layall-h2-abuf/ep10-seed42/checkpoint_last.pt
| loading train data for epoch 0
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.en
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.bert
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.wordpieces
| loaded 25 examples from: DATA/wiki25/embeddings/RoBERTa-large-top24/train.en-actions.en.wp2w
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.nopos_in
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.nopos_out
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.pos
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.vocab_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.src_cursors
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_1stnode_masks
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_cur_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_cur_1stnode_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_directions
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_pre_node_indexes
| loaded 25 examples from: DATA/wiki25/features/cofill_o8.3_act-states_RoBERTa-large-top24//train.en-actions.actions.actedge_allpre_directions
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
Traceback (most recent call last):
File "fairseq_ext/train.py", line 341, in
cli_main()
File "fairseq_ext/train.py", line 333, in cli_main
nprocs=args.distributed_world_size,
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 5 terminated with the following error:
Traceback (most recent call last):
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/data4/yhchen/transition-amr-parser/fairseq_ext/train.py", line 297, in distributed_main
main(args, init_distributed=True)  # distributed training
File "/data4/yhchen/transition-amr-parser/fairseq_ext/train.py", line 103, in main
train(args, trainer, task, epoch_itr)
File "/data4/yhchen/transition-amr-parser/fairseq_ext/train.py", line 149, in train
log_output = trainer.train_step(samples)
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/fairseq/trainer.py", line 264, in train_step
ignore_grad
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/fairseq_ext/tasks/amr_action_pointer_graphmp.py", line 462, in train_step
loss, sample_size, logging_output = criterion(model, sample)
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/data4/yhchen/miniconda3/envs/AMR/lib/python3.6/site-packages/fairseq_ext/criterions/label_smoothed_cross_entropy_pointer.py", line 104, in forward
net_output = model(**sample['net_input'])
TypeError: 'NoneType' object is not subscriptable
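The failing call is model(**sample['net_input']) with sample arriving as None on one of the spawned workers. Purely as a hypothetical diagnostic sketch (not code from this repository), a guard around the criterion call would look like:

import torch

def criterion_with_none_guard(criterion, model, sample):
    # Sketch only: skip batches that arrive as None instead of letting
    # sample['net_input'] raise "'NoneType' object is not subscriptable".
    if sample is None or sample.get('net_input') is None:
        return torch.tensor(0.0), 0, {}  # zero loss, zero sample size, no logs
    return criterion(model, sample)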

Infinite loss when fine-tuning

I'm trying to fine-tune the AMR3.0 large SBART checkpoint on another dataset, but during training I get the following warnings:

2023-04-29 00:02:05 | WARNING | tensorboardX.x2num | NaN or Inf found in input tensor.
2023-04-29 00:02:05 | WARNING | tensorboardX.x2num | NaN or Inf found in input tensor.
2023-04-29 00:02:05 | WARNING | tensorboardX.x2num | NaN or Inf found in input tensor.
2023-04-29 00:02:05 | WARNING | tensorboardX.x2num | NaN or Inf found in input tensor.
2023-04-29 00:02:05 | INFO | train | {"epoch": 1, "train_loss": "inf", "train_nll_loss": "inf", "train_loss_seq": "inf", "train_nll_loss_seq": "inf", "train_loss_pos": "0.710562", "train_nll_loss_pos": "0.710562", "train_wps": "687.9", "train_ups": "0.51", "train_wpb": "1354.7", "train_bsz": "55.2", "train_num_updates": "71", "train_lr": "1.87323e-06", "train_gnorm": "17.868", "train_loss_scale": "8", "train_train_wall": "45", "train_wall": "158"}

In my config I set the fairseq-preprocess arguments as:

FAIRSEQ_PREPROCESS_FINETUNE_ARGS="--srcdict /content/DATA/AMR3.0/models/amr3.0-structured-bart-large-neur-al/seed42/dict.en.txt --tgtdict /content/DATA/AMR3.0/models/amr3.0-structured-bart-large-neur-al/seed42/dict.actions_nopos.txt"

and train args as:

FAIRSEQ_TRAIN_FINETUNE_ARGS="--finetune-from-model /content/DATA/AMR3.0/models/amr3.0-structured-bart-large-neur-al/seed42/checkpoint_wiki.smatch_top5-avg.pt --memory-efficient-fp16 --batch-size 16 --max-tokens 512 --patience 10"

Any ideas as to what I'm doing wrong?
Thanks in advance.
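As a generic debugging sketch (the checkpoint path is the one from the config above; the 'model' key and the embed_tokens parameter names are standard fairseq conventions, not verified against this parser), one way to rule out a mismatch between the supplied dictionaries and the checkpoint is to print the stored embedding shapes and compare them with the line counts of dict.en.txt / dict.actions_nopos.txt:

import torch

# Hypothetical check: inspect vocabulary-sized tensors in the checkpoint
ckpt = torch.load(
    "/content/DATA/AMR3.0/models/amr3.0-structured-bart-large-neur-al/"
    "seed42/checkpoint_wiki.smatch_top5-avg.pt",
    map_location="cpu",
)
for name, tensor in ckpt["model"].items():
    if "embed_tokens" in name or "output_projection" in name:
        print(name, tuple(tensor.shape))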

ZeroDivisionError: division by zero when running the 'amr-parse' command

Full trace.

amr-parse -c /home/aianta/transition-amr-parser/DATA/AMR2.0/models/amr2.0-structured-bart-large-neur-al/seed42/checkpoint_wiki.smatch_top5-avg.pt -i test_file.txt -o test_out.txt
| [en] dictionary: 34112 types
| [actions_nopos] dictionary: 12832 types
----------loading pretrained bart.large model ----------
Downloading: "https://github.com/pytorch/fairseq/archive/main.zip" to /home/aianta/.cache/torch/hub/main.zip
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3699866548/3699866548 [01:02<00:00, 59519919.52B/s]
---------- task bart rewind: loading pretrained bart.large model ----------
Using cache found in /home/aianta/.cache/torch/hub/pytorch_fairseq_main
using GPU for models
pretrained_embed:  bart.large
Using cache found in /home/aianta/.cache/torch/hub/pytorch_fairseq_main
Using bart.large extraction in GPU
Finished loading models
self.machine_config:  /home/aianta/transition-amr-parser/DATA/AMR2.0/models/amr2.0-structured-bart-large-neur-al/seed42/machine_config.json
Total time taken to load parser: 0:02:23.460215
Parsing 1 sentences
Running on batch size: 1
1
decoding:   0%|                                                                                                                                                                 | 0/1 [00:00<?, ?it/s]/home/aianta/anaconda3/envs/ibm-amr/lib/python3.7/site-packages/fairseq/search.py:140: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  beams_buf = indices_buf // vocab_size
decoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.40it/s]
Traceback (most recent call last):
  File "/home/aianta/anaconda3/envs/ibm-amr/bin/amr-parse", line 33, in <module>
    sys.exit(load_entry_point('transition-amr-parser', 'console_scripts', 'amr-parse')())
  File "/home/aianta/transition-amr-parser/transition_amr_parser/parse.py", line 756, in main
    sents_per_second = num_sent / time_secs.seconds
ZeroDivisionError: division by zero

Went to line 756 in parse.py and changed:

sents_per_second = num_sent / time_secs.seconds

to

sents_per_second = num_sent / (time_secs.seconds if time_secs.seconds != 0 else 0.1 )

And everything worked fine.
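For reference, a slightly more robust variant of the same one-line patch (a sketch, assuming time_secs is a datetime.timedelta, as the .seconds attribute suggests): total_seconds() keeps sub-second precision, and the max() guard avoids the division by zero:

# hypothetical alternative for the same line in transition_amr_parser/parse.py
sents_per_second = num_sent / max(time_secs.total_seconds(), 1e-3)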

Reproduction problem: getting 0.4 Smatch with amr2.0-structured-bart-large-neur-al-seed42

Thanks for releasing the code and the pre-trained weights. It's very helpful. However, I'm having trouble reproducing the results with the pre-trained weights released in trained-checkpoints. I'll detail the steps and some example text below. Thank you in advance!

my steps

  1. Decode with the pre-trained model
in_checkpoint="/app/transition-amr-parser/DATA/amr_align_2022/DATA/AMR2.0/models/amr2.0-structured-bart-large-neur-al/seed42/checkpoint_wiki.smatch_top5-avg.pt"
input_file="/app/transition-amr-parser/DATA/AMR2.0/DATA/AMR2.0/corpora/test_sentence.txt"
amr-parse -c $in_checkpoint -i $input_file -o /app/transition-amr-parser/DATA/OUTPUT/test_output.amr
  2. Compute Smatch
smatch.py -f /app/transition-amr-parser/DATA/OUTPUT/test_output.amr /app/transition-amr-parser/DATA/AMR2.0/corpora/test.txt
# F-score: 0.28
  3. Became aware of the --no-isi flag
  4. Decode with the pre-trained model using the --no-isi flag and compute Smatch again

amr-parse -c $in_checkpoint -i $input_file -o /app/transition-amr-parser/DATA/OUTPUT/test_output_1.amr --no-isi

smatch.py -f /app/transition-amr-parser/DATA/OUTPUT/test_output_1.amr /app/transition-amr-parser/DATA/AMR2.0/corpora/test.txt

# F-score: 0.40

Examples from the text files

input_file

Lines 8-9 of
input_file="/app/transition-amr-parser/DATA/AMR2.0/DATA/AMR2.0/corpora/test_sentence.txt":
How Long are We Going to Tolerate Japan ?
My fellow citizens :

output_file

The corresponding predictions for these sentences, from
/app/transition-amr-parser/DATA/OUTPUT/test_output_1.amr:
# ::tok How Long are We Going to Tolerate Japan ?
# ::alignments a~0 w~3 t~6 c~7 n~7 0~7
(t / tolerate-01
    :ARG0 (w / we)
    :ARG1 (c / country
        :name (n / name
            :op1 "Japan"))
    :duration (a / amr-unknown))

# ::tok My fellow citizens :
# ::alignments i~0 f~1 c~2
(c / citizen
    :mod (f / fellow)
    :poss (i / i))
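For reference, the decoded file can also be inspected programmatically. Below is a small sketch using the third-party penman library (an assumption that it is installed; it is not required by anything in this issue), which exposes the ::tok and ::alignments lines as graph metadata:

import penman

# Sketch: load the decoded AMRs and print tokens, alignments and the graph
graphs = penman.load("/app/transition-amr-parser/DATA/OUTPUT/test_output_1.amr")
for g in graphs[:2]:
    print(g.metadata.get("tok"))
    print(g.metadata.get("alignments"))
    print(penman.encode(g))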

Setting the environment

Hi, I'm having problems running the bash tests/minimal_test.sh command. I am not sure why this is happening but I believe it starts when I try to set the environment.
I already added execution permissions to set_environment.sh by running chmod a+x set_environment.sh, but when I source it, nothing seems to happen:

(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$ touch set_environment.sh
(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$ . set_environment.sh
(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$

Furthermore, running bash scripts/download_and_patch_fairseq.sh also seems to do nothing:

(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$ bash scripts/download_and_patch_fairseq.sh
(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$ 

Everything is installed properly:

(base) luis@luis-BOHK-WAX9X:~/transition-amr-parser$ python tests/correctly_installed.py

pytorch 1.8.1+cu102
fairseq 0.7.2
spacy 2.2.3
[OK] correctly installed

But when I run bash tests/minimal_test.sh, the test does not complete (I will leave the full output below for convenience).

[normalize rules] months
[normalize rules] units
[normalize rules] cardinals
[normalize rules] ordinals
Read DATA/wiki25.jkaln
25 sentences
216/293 node types/tokens
35/285 edge types/tokens
241/383 word types/tokens
AMR contains 4 duplicate edges
{'ARG1': 4}
2021-05-13 11:33:55 [amr] Processing oracle
2021-05-13 11:33:55 [oracle] Parsing data
computing oracle: 100%|█████████████████████████| 25/25 [00:00<00:00, 92.61it/s]
2021-05-13 11:33:56 [oracle] Done
Not whitelisted actions used e.g. arcs for unconfirmed words
Counter({'LA': 2})
Blacklisted actions used e.g. duplicated edges
Counter({'RA': 1})
There were 18 disconnected nodes (:rel)
2021-05-13 11:33:56 [Totals:] 0.61
2021-05-13 11:33:56 [Totals:] Failed Entity Predictions:
Namespace(alignfile=None, batch_normalize_reward=False, bert_layers=None, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='DATA.tests/features/wiki25.jkaln/', entity_rules='DATA.tests/oracles/wiki25.jkaln//entity_rules.json', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, gold_annotations=None, gold_episode_ratio=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', machine_rules='DATA.tests/oracles/wiki25.jkaln//train.rules.json', machine_type='AMR', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, pretrained_embed='roberta.base', seed=1, source_lang='en', srcdict=None, target_lang='actions', task='translation', tbmf_wrapper=False, tensorboard_logdir='', testpref='DATA.tests/oracles/wiki25.jkaln//test', tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenize_by_whitespace=False, tokenizer=None, trainpref='DATA.tests/oracles/wiki25.jkaln//train', user_dir=None, validpref='DATA.tests/oracles/wiki25.jkaln//dev', workers=1)
| [en] Dictionary: 247 types
| [en] DATA.tests/oracles/wiki25.jkaln//train.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA.tests/oracles/wiki25.jkaln//dev.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [en] Dictionary: 247 types
| [en] DATA.tests/oracles/wiki25.jkaln//test.en: 25 sents, 408 tokens, 0.0% replaced by <unk>
| [actions] Dictionary: 127 types
| [actions] DATA.tests/oracles/wiki25.jkaln//train.actions: 25 sents, 1327 tokens, 0.0% replaced by <unk>
| [actions] Dictionary: 127 types
| [actions] DATA.tests/oracles/wiki25.jkaln//dev.actions: 25 sents, 1327 tokens, 0.0% replaced by <unk>
| [actions] Dictionary: 127 types
| [actions] DATA.tests/oracles/wiki25.jkaln//test.actions: 25 sents, 1327 tokens, 0.0% replaced by <unk>
Using cache found in /home/luis/.cache/torch/hub/pytorch_fairseq_master
fatal: not a git repository (or any of the parent directories): .git
running build_ext
/home/luis/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py:369: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
skipping 'fairseq/data/data_utils_fast.cpp' Cython extension (up-to-date)
skipping 'fairseq/data/token_block_utils_fast.cpp' Cython extension (up-to-date)
copying build/lib.linux-x86_64-3.8/fairseq/libbleu.cpython-38-x86_64-linux-gnu.so -> fairseq
copying build/lib.linux-x86_64-3.8/fairseq/data/data_utils_fast.cpython-38-x86_64-linux-gnu.so -> fairseq/data
copying build/lib.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.cpython-38-x86_64-linux-gnu.so -> fairseq/data
copying build/lib.linux-x86_64-3.8/fairseq/libbase.cpython-38-x86_64-linux-gnu.so -> fairseq
copying build/lib.linux-x86_64-3.8/fairseq/libnat.cpython-38-x86_64-linux-gnu.so -> fairseq
loading archive file http://dl.fbaipublicfiles.com/fairseq/models/roberta.base.tar.gz from cache at /home/luis/.cache/torch/pytorch_fairseq/37d2bc14cf6332d61ed5abeb579948e6054e46cc724c7d23426382d11a31b2d6.ae5852b4abc6bf762e0b6b30f19e741aa05562471e9eb8f4a6ae261f04f9b350
| dictionary: 50264 types
Using roberta.base extraction in cpu (slow, wont OOM)

25it [00:00, 84.66it/s]

There were missing actions
Counter({'LA(op1)': 1, 'LA(domain)': 1, 'RA(ARG1)': 1})
Using cache found in /home/luis/.cache/torch/hub/pytorch_fairseq_master
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/hubconf.py", line 49, in <module>
    import fairseq.data.token_block_utils_fast  # noqa
ModuleNotFoundError: No module named 'fairseq.data.token_block_utils_fast'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 152, in save_modules
    yield saved
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 193, in setup_context
    yield
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 254, in run_setup
    _execfile(setup_script, ns)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 43, in _execfile
    exec(code, globals, locals)
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/setup.py", line 268, in <module>
    do_setup(package_data)
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/setup.py", line 179, in do_setup
    setup(
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/luis/anaconda3/lib/python3.8/distutils/core.py", line 134, in setup
    ok = dist.parse_command_line()
  File "/home/luis/anaconda3/lib/python3.8/distutils/dist.py", line 483, in parse_command_line
    args = self._parse_command_opts(parser, args)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/dist.py", line 903, in _parse_command_opts
    nargs = _Distribution._parse_command_opts(self, parser, args)
  File "/home/luis/anaconda3/lib/python3.8/distutils/dist.py", line 546, in _parse_command_opts
    raise DistutilsClassError(
distutils.errors.DistutilsClassError: command class <class 'torch.utils.cpp_extension.BuildExtension'> must subclass Command

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/luis/anaconda3/bin/fairseq-preprocess", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-preprocess')())
  File "/home/luis/transition-amr-parser/fairseq-stack-transformer/fairseq_cli/preprocess.py", line 295, in cli_main
    main(args)
  File "/home/luis/transition-amr-parser/fairseq-stack-transformer/fairseq_cli/preprocess.py", line 212, in main
    make_state_machine(args, src_dict, tgt_dict, tokenize=tokenize)
  File "/home/luis/transition-amr-parser/transition_amr_parser/stack_transformer/preprocess.py", line 300, in make_state_machine
    make_binary_bert_features(args, validpref, outprefix, src_dict.eos_index, src_dict.pad_index, tokenize)
  File "/home/luis/transition-amr-parser/transition_amr_parser/stack_transformer/preprocess.py", line 210, in make_binary_bert_features
    pretrained_embeddings = PretrainedEmbeddings(
  File "/home/luis/transition-amr-parser/transition_amr_parser/stack_transformer/pretrained_embeddings.py", line 180, in __init__
    self.roberta = torch.hub.load('pytorch/fairseq', name)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/torch/hub.py", line 339, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/torch/hub.py", line 365, in _load_local
    hub_module = import_module(MODULE_HUBCONF, hubconf_path)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/torch/hub.py", line 74, in import_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/hubconf.py", line 56, in <module>
    sandbox.run_setup(
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 257, in run_setup
    raise
  File "/home/luis/anaconda3/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 193, in setup_context
    yield
  File "/home/luis/anaconda3/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 164, in save_modules
    saved_exc.resume()
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 139, in resume
    raise exc.with_traceback(self._tb)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 152, in save_modules
    yield saved
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 193, in setup_context
    yield
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 254, in run_setup
    _execfile(setup_script, ns)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/sandbox.py", line 43, in _execfile
    exec(code, globals, locals)
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/setup.py", line 268, in <module>
    do_setup(package_data)
  File "/home/luis/.cache/torch/hub/pytorch_fairseq_master/setup.py", line 179, in do_setup
    setup(
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/luis/anaconda3/lib/python3.8/distutils/core.py", line 134, in setup
    ok = dist.parse_command_line()
  File "/home/luis/anaconda3/lib/python3.8/distutils/dist.py", line 483, in parse_command_line
    args = self._parse_command_opts(parser, args)
  File "/home/luis/anaconda3/lib/python3.8/site-packages/setuptools/dist.py", line 903, in _parse_command_opts
    nargs = _Distribution._parse_command_opts(self, parser, args)
  File "/home/luis/anaconda3/lib/python3.8/distutils/dist.py", line 546, in _parse_command_opts
    raise DistutilsClassError(
distutils.errors.DistutilsClassError: command class <class 'torch.utils.cpp_extension.BuildExtension'> must subclass Command

Any idea why this might be happening?
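The failing step, isolated from the preprocessing pipeline, is the torch.hub download and build of roberta.base (the torch.hub.load call at line 180 of pretrained_embeddings.py in the traceback). A minimal sketch to reproduce just that step on its own:

import torch

# Sketch: same call made by
# transition_amr_parser/stack_transformer/pretrained_embeddings.py
roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
print(type(roberta))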
