xing-hu / emse-deepcom



The dataset for EMSE-DeepCom

License: MIT License

Perl 0.03% Python 0.08% Shell 0.01% Java 0.34% TeX 0.01% NewLisp 99.55%

emse-deepcom's Issues

I stopped training the model at epoch 28, at a learning rate of 0.13, but I am not able to run prediction. Is it necessary to train for the full 50 epochs before running prediction? Training for 28 epochs has already used up my small budget, and I would like to see the model working. Could you kindly explain this? Thank you.

FileNotFoundError: [Errno 2] No such file or directory: '../emse-data/vocab.code'

Sorry, I can't find the file vocab.code anywhere. Is something wrong? I would appreciate a reply.

DeepCom and DFS analysis

I happened to notice that there is a link between the SBT and the DFS (depth-first search) sequence. When I train the model on a limited set of code, say 10,000 methods (due to computational limitations), the BLEU-4 scores of the SBT-based and DFS-based models are very similar, differing by less than 1. Is this due to the limited number of methods, or to the similarity between the DFS and SBT sequences?
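To make the relationship in this question concrete, here is a minimal sketch that produces both a plain pre-order DFS sequence and a bracketed structure-based traversal from the node-list AST format that appears elsewhere in these issues (`id`, `type`, `children`). The bracketing scheme follows the general description of SBT in the DeepCom paper; the function names are hypothetical and this is not the repository's implementation.

```python
def dfs_sequence(nodes, root_id=0):
    """Pre-order depth-first sequence of node types."""
    by_id = {node["id"]: node for node in nodes}
    out = []

    def visit(node_id):
        node = by_id[node_id]
        out.append(node["type"])
        for child_id in node.get("children", []):
            visit(child_id)

    visit(root_id)
    return out


def sbt_sequence(nodes, root_id=0):
    """Bracketed traversal: "( type <children> ) type" for every node."""
    by_id = {node["id"]: node for node in nodes}
    out = []

    def visit(node_id):
        node = by_id[node_id]
        out.extend(["(", node["type"]])
        for child_id in node.get("children", []):
            visit(child_id)
        out.extend([")", node["type"]])

    visit(root_id)
    return out


example_ast = [
    {"id": 0, "type": "MethodDeclaration", "children": [1, 2]},
    {"id": 1, "type": "BasicType"},
    {"id": 2, "type": "ReturnStatement"},
]
print(" ".join(dfs_sequence(example_ast)))
print(" ".join(sbt_sequence(example_ast)))
```

In this sketch, the tokens that follow each opening bracket in the SBT sequence are exactly the DFS sequence; the closing tokens only add the information needed to rebuild the tree. That overlap might contribute to similar scores on a small sample, but the sketch does not establish that.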

Why did you change the data set again?

I processed the dataset just last week. Today, the dataset on your Google Drive is different.

Prediction (Output) file not getting generated

The prediction file is not generated after training the model.

I found the .code and .nl files, but I did not find the .sbt file. How do I get the .sbt file? Thank you very much for answering my question

Version problem about NLTK

I tried to reproduce the results with this code, but the evaluation result does not seem consistent with yours. May I ask which version of NLTK you imported in this code file? Thanks!

Hello, how should javalang.parser.JavaSyntaxError be handled in get_ast?

I would like to ask: in get_ast, when a javalang.parser.JavaSyntaxError is raised and no AST can be generated, how should this be handled?

Missing config.yaml file

Could you please add the config file that is used in main.py, or leave a link to it?
Thanks

Why does the SBT data use only the “type” information in the AST?

In your ICPC 2018 paper “Deep Code Comment Generation”, doesn't the SBT include the values in the AST, not only the types?

Could you tell me how the hybrid.out file is generated?

I'm a college student and I want to reproduce your experiment. Although I can train this model successfully, I don't know how to produce the hybrid.out file myself.

Can I use the following command to generate this file? I don't know if it is right because I have spent more than three hours and still have no results.

python3 __main__.py config.yaml --decode --output ../emse-data/output/hybird.out

Is there duplicate data between the test set and the training set?

Hello, thank you for sharing the code and dataset.
However, I found that 4,463 items in the test set also appear in the training set. Was it deliberately designed this way?
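For anyone who wants to check this kind of overlap themselves, here is a minimal sketch that counts test samples whose text also occurs in the training set. It assumes one sample per line and exact string matches after stripping surrounding whitespace; how the figure of 4463 above was obtained is not stated, so a different matching rule may give a different count. The function name is hypothetical.

```python
def count_test_in_train(train_lines, test_lines):
    """Count test samples that also occur verbatim in the training samples."""
    train_set = {line.strip() for line in train_lines}
    return sum(1 for line in test_lines if line.strip() in train_set)


train = ["public int size ( )", "return x ;"]
test = ["return x ; ", "void run ( )", "public int size ( )"]
print(count_test_in_train(train, test))
```

With real data, the two arguments could be open file handles for the training and test files, since iterating over a file yields its lines.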

Why is the corpus_bleu 0.0000?

I used data-v1, the dataset you provide, to train this model. Strangely, the corpus_bleu result is always 0.0000. Did I do something wrong?

07/11 02:15:34 test score=0.0000 avg_score=0.3086
07/11 03:33:38 test score=0.0000 avg_score=0.3993
07/11 04:51:08 test score=0.0000 avg_score=0.3668

tensorflow.python.framework.errors_impl.AlreadyExistsError:

Hi,

I am hitting this TensorFlow error and have not been able to resolve it on my side.

Error:
tensorflow.python.framework.errors_impl.AlreadyExistsError: Resource __per_step_24/gradients/decoder_nl/while/decoder_nl/decoder_nl/gru_cell/gates/strided_slice/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
[[{{node gradients/decoder_nl/while/decoder_nl/decoder_nl/gru_cell/gates/strided_slice/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var}}]]

Trace Back--

Traceback (most recent call last):
  File "__main__.py", line 329, in <module>
    main()
  File "__main__.py", line 323, in main
    model.train(**config)
  File "/home/tjinpa2019/EMSE-DeepCom/source code/translation_model.py", line 392, in train
    self.train_step(loss_function=loss_function, **kwargs)
  File "/home/tjinpa2019/EMSE-DeepCom/source code/translation_model.py", line 444, in train_step
    update_baseline=True)
  File "/home/tjinpa2019/EMSE-DeepCom/source code/seq2seq_model.py", line 177, in step
    res = tf.get_default_session().run(output_feed, input_feed)
  File "/home/tjinpa2019/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/tjinpa2019/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/tjinpa2019/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/tjinpa2019/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)

Could you please look into this issue? I don't believe I can resolve it on my side.

Generate strange AST

@xing-hu
Hi, I have a question. When I generate an AST with the get_AST.py file, the resulting tree looks strange. An example is shown below:
e.g. [{"id": 0, "type": "MethodDeclaration(annotations=[], body=[ReturnStatement(expression=MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None), label=None)], documentation=None, modifiers={'public'}, name=numGeneratedSequences, parameters=[], return_type=BasicType(dimensions=[], name=int), throws=None, type_parameters=None)", "children": [1, 2], "value": "numGeneratedSequences"}, {"id": 1, "type": "BasicType(dimensions=[], name=int)", "value": "int"}, {"id": 2, "type": "ReturnStatement(expression=MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None), label=None)", "children": [3], "value": "return"}, {"id": 3, "type": "MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None)", "value": "gralComponents.size"}]

How can I generate the AST described in the paper?
e.g. [{"id": 0, "type": "MethodDeclaration", "children": [1, 2], "value": "doesNotHaveIds"}, {"id": 1, "type": "BasicType", "value": "boolean"}, {"id": 2, "type": "ReturnStatement", "children": [3], "value": "return"}, {"id": 3, "type": "BinaryOperation", "children": [4, 7]}, {"id": 4, "type": "BinaryOperation", "children": [5, 6]}, {"id": 5, "type": "MethodInvocation", "value": "getIds"}, {"id": 6, "type": "Literal", "value": "null"}, {"id": 7, "type": "MethodInvocation", "children": [8, 9], "value": "getIds"}, {"id": 8, "type": "MethodInvocation", "value": "."}, {"id": 9, "type": "MethodInvocation", "value": "."} ]
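The long `type` strings in the first example look like the full printed form of each node, while the second example holds only the class name. As a stopgap, the text before the first opening parenthesis can be kept. The sketch below post-processes an already generated JSON line under that assumption; the function name is hypothetical and it does not change how get_AST.py builds the tree.

```python
import json


def simplify_types(ast_json):
    """Reduce each node's "type" to the text before its first "("."""
    nodes = json.loads(ast_json)
    for node in nodes:
        node["type"] = node["type"].split("(", 1)[0]
    return nodes


raw = '[{"id": 1, "type": "BasicType(dimensions=[], name=int)", "value": "int"}]'
print(json.dumps(simplify_types(raw)))
```

At generation time, the equivalent would presumably be to record `type(node).__name__` instead of the node's string form, but whether that matches what get_AST.py does has not been checked here.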

The link of the dataset has broken

Hello!
The dataset link in the README no longer works, so the vocab.code file cannot be downloaded. Could you please update the link? Thank you very much!

Could you provide the source code for generating the SBT data from the AST?

Thank you very much!

Can you please upload a trained model? I would like to see it working, and I can't train the model from scratch because of hardware constraints. Thank you.

The file get_ast.py is confusing and incomplete

@xing-hu
To verify what get_ast.py does, I fed it train.token.code, which has 445,812 samples. However, only 27,025 AST samples were generated. It seems that many code samples were filtered out by the try/except statement in the second function, 'get_ast(file_name, w)', and the AST file looked strange, too. An example is shown below.

e.g. [{"id": 0, "type": "MethodDeclaration(annotations=[], body=[ReturnStatement(expression=MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None), label=None)], documentation=None, modifiers={'public'}, name=numGeneratedSequences, parameters=[], return_type=BasicType(dimensions=[], name=int), throws=None, type_parameters=None)", "children": [1, 2], "value": "numGeneratedSequences"}, {"id": 1, "type": "BasicType(dimensions=[], name=int)", "value": "int"}, {"id": 2, "type": "ReturnStatement(expression=MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None), label=None)", "children": [3], "value": "return"}, {"id": 3, "type": "MethodInvocation(arguments=[], member=size, postfix_operators=[], prefix_operators=[], qualifier=gralComponents, selectors=[], type_arguments=None)", "value": "gralComponents.size"}]

On the other hand, when I used the train.source file as input and executed only the second function, 'get_ast(file_name, w)', no code samples were filtered out. Every sample produced an AST, but the structure was still strange. I think both the first function, 'process_source(file_name, save_file)', and the second function, 'get_ast(file_name, w)', have some problems, or maybe train.token.code is not the intended input of get_ast.py.
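One way to find out why so many samples disappear is to tally the exceptions instead of dropping them silently. The following is a minimal sketch of that idea, not the code in get_ast.py: `parse` stands for whatever parsing function is in use, and `demo_parse` is a hypothetical stand-in so that the example runs without any third-party library.

```python
from collections import Counter


def parse_all(samples, parse):
    """Apply `parse` to every sample; keep results and count failures by exception type."""
    parsed = []
    failures = Counter()
    for index, sample in enumerate(samples):
        try:
            parsed.append((index, parse(sample)))
        except Exception as exc:  # broad on purpose: the goal is to see what fails
            failures[type(exc).__name__] += 1
    return parsed, failures


def demo_parse(code):
    """Hypothetical stand-in parser: rejects samples without a method body."""
    if "{" not in code:
        raise ValueError("no method body")
    return len(code)


parsed, failures = parse_all(["void f ( ) { }", "void g ( )"], demo_parse)
print(len(parsed), dict(failures))
```

If nearly all failures share one exception type, that would point at a systematic mismatch between the input format and what the parser expects rather than at individual bad samples.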

Why is the corpus_bleu 0.0001? I have trained the model.

decaying learning rate to :0.061
decaying learning rate to :0.058
step 46000 epoch 43 learning rate 0.058 step-time 3.065 loss 0.764
test eval loss:123.05
start decoding
corpus_bleu:0.0001 avg_score:0.2200

And where is the final output? Is it in model/eval/test.46000.out?
Thank you.
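As a side note on the learning rates in this log, they are consistent with a simple exponential decay. The sketch below assumes the initial rate 0.5 and decay factor 0.95 that appear in the configuration dump posted in another issue on this page; whether this run used those values is not stated, and the function name is hypothetical.

```python
def decayed_learning_rate(initial_rate, decay_factor, num_decays):
    """Learning rate after multiplying by `decay_factor` `num_decays` times."""
    return initial_rate * decay_factor ** num_decays


# Assumed values: initial_rate=0.5 and decay_factor=0.95.
for num_decays in (41, 42):
    print(num_decays, round(decayed_learning_rate(0.5, 0.95, num_decays), 3))
```

Under these assumptions, 41 and 42 decays give roughly 0.061 and 0.058, which match the two rates in the log above.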

How to train the model?

Could you tell me how to train the model?
The README doesn't say.

The data process code is incomplete

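The link is invalid

As the limitation of LFS, the dataset can be downloaded from Google Drive (dataset version 1)

This link no longer works.

Q1: Does the data at this link include the SBT data mentioned in the paper?
Q2: The SBT produced by the current code appears to use only the type, rather than the type_value form. See https://github.com/xing-hu/EMSE-DeepCom/blob/master/data_utils/ast_traversal.py#L9
Q3: I processed the data with the code you shared, and the resulting BLEU-4 is only in the teens. Even after increasing the number of epochs all the way to 4000, it only reaches 20, far below the 38 for DeepCom and the 39 for H-DeepCom reported in the paper.

Please feel free to reply. Thanks.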

Many Thanks

Ensheng

Please specify where the code that generates the SBT sequence from the AST is located

../config/default.yaml error

I am trying to train the model on Google Cloud, but I keep getting the same error:

  File "__main__.py", line 329, in <module>
    main()
  File "__main__.py", line 126, in main
    with open('../config/default.yaml') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../config/default.yaml'
tjinpa2019@cloudshell:~/EMSE-DeepCom/source code (bamboo-truck-281120)$

Could you kindly help me with this? I have been stuck on this problem for some time now.

File get_ast.py returns too big a string.

Hi, it appears that Python handles its print method differently now:
When get_ast.py is run on a .java file, the full constructor with all its parameters is returned when the json.dump method is called on the output variable, so the JSON contains a huge output.

A small example will describe the issue:

>>> import javalang
>>> tree = javalang.parse.parse("public class a {public static void main(String args[]){}}")
>>> for path, node in tree:
...     print(path, node)

gives the following output:

() CompilationUnit(imports=[], package=None, types=[ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)])
(CompilationUnit(imports=[], package=None, types=[ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)]), [ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)]) ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)
(CompilationUnit(imports=[], package=None, types=[ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)]), [ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)], ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None), [MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)]) MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, 
type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)
(CompilationUnit(imports=[], package=None, types=[ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)]), [ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)], ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None), [MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, 
type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None), [FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)]) FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)
(CompilationUnit(imports=[], package=None, types=[ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)]), [ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None)], ClassDeclaration(annotations=[], body=[MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], documentation=None, extends=None, implements=None, modifiers={'public'}, name=a, type_parameters=None), [MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None)], MethodDeclaration(annotations=[], body=[], documentation=None, modifiers={'public', 'static'}, name=main, parameters=[FormalParameter(annotations=[], modifiers=set(), name=args, 
type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], return_type=None, throws=None, type_parameters=None), [FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)], FormalParameter(annotations=[], modifiers=set(), name=args, type=ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None), varargs=False)) ReferenceType(arguments=None, dimensions=[None], name=String, sub_type=None)

There's a relatively simple fix for this, but I'd like to confirm first whether this really is an issue at the moment.

FileNotFoundError: [Errno 2] No such file or directory: '../emse-data/vocab.sbt'

Hi :)
I'm a student studying NLP, especially neural machine translation.
So I read your DeepCom paper; it is a very interesting task.
I tried to run your EMSE-DeepCom code, but I ran into some difficulties.

When I execute train.py, I get FileNotFoundError for files such as emse-data/vocab, test, train, etc.
I made the test.sbt and train.sbt files using ast_traversal.py, but vocab.sbt is the one file I can't create.

How do I make it?
Should I make a vocab.json file?

This is the error description.
Thank you 😀

(deepcom) gpuadmin@gpuadmin:~/ahjeong/EMSE-DeepCom/source code$ python3 __main__.py config.yaml --train -v
02/17 13:52:57 label: default
02/17 13:52:57 description:
default configuration
next line of description
last line
02/17 13:52:57 __main__.py config.yaml --train -v
02/17 13:52:57 commit hash 2f4e873
02/17 13:52:57 tensorflow version: 1.10.0
02/17 13:52:57 program arguments
02/17 13:52:57 aggregation_method 'sum'
02/17 13:52:57 align_encoder_id 0
02/17 13:52:57 allow_growth True
02/17 13:52:57 attention_type 'global'
02/17 13:52:57 attn_filter_length 0
02/17 13:52:57 attn_filters 0
02/17 13:52:57 attn_prev_word False
02/17 13:52:57 attn_size 128
02/17 13:52:57 attn_temperature 1.0
02/17 13:52:57 attn_window_size 0
02/17 13:52:57 average False
02/17 13:52:57 batch_mode 'standard'
02/17 13:52:57 batch_size 64
02/17 13:52:57 beam_size 5
02/17 13:52:57 bidir False
02/17 13:52:57 bidir_projection False
02/17 13:52:57 binary False
02/17 13:52:57 cell_size 256
02/17 13:52:57 cell_type 'GRU'
02/17 13:52:57 character_level False
02/17 13:52:57 checkpoints []
02/17 13:52:57 conditional_rnn False
02/17 13:52:57 config 'config.yaml'
02/17 13:52:57 convolutions None
02/17 13:52:57 data_dir '../emse-data'
02/17 13:52:57 debug False
02/17 13:52:57 decay_after_n_epoch 1
02/17 13:52:57 decay_every_n_epoch 1
02/17 13:52:57 decay_if_no_progress None
02/17 13:52:57 decoders [{'max_len': 30, 'name': 'nl'}]
02/17 13:52:57 description 'default configuration\nnext line of description\nlast line\n'
02/17 13:52:57 dev_prefix 'test'
02/17 13:52:57 early_stopping True
02/17 13:52:57 embedding_size 256
02/17 13:52:57 embeddings_on_cpu True
02/17 13:52:57 encoders [{'attention_type': 'global', 'max_len': 200, 'name': 'code'},
{'attention_type': 'global', 'max_len': 500, 'name': 'sbt'}]
02/17 13:52:57 ensemble False
02/17 13:52:57 eval_burn_in 0
02/17 13:52:57 feed_previous 0.0
02/17 13:52:57 final_state 'last'
02/17 13:52:57 freeze_variables []
02/17 13:52:57 generate_first True
02/17 13:52:57 gpu_id 6
02/17 13:52:57 highway_layers 0
02/17 13:52:57 initial_state_dropout 0.0
02/17 13:52:57 initializer None
02/17 13:52:57 input_layer_dropout 0.0
02/17 13:52:57 input_layers None
02/17 13:52:57 keep_best 5
02/17 13:52:57 keep_every_n_hours 0
02/17 13:52:57 label 'default'
02/17 13:52:57 layer_norm False
02/17 13:52:57 layers 1
02/17 13:52:57 learning_rate 0.5
02/17 13:52:57 learning_rate_decay_factor 0.95
02/17 13:52:57 len_normalization 1.0
02/17 13:52:57 log_file 'log.txt'
02/17 13:52:57 loss_function 'xent'
02/17 13:52:57 max_dev_size 0
02/17 13:52:57 max_epochs 100
02/17 13:52:57 max_gradient_norm 5.0
02/17 13:52:57 max_len 50
02/17 13:52:57 max_steps 600000
02/17 13:52:57 max_test_size 0
02/17 13:52:57 max_to_keep 1
02/17 13:52:57 max_train_size 0
02/17 13:52:57 maxout_stride None
02/17 13:52:57 mem_fraction 1.0
02/17 13:52:57 min_learning_rate 1e-06
02/17 13:52:57 model_dir '../emse-data/model/hybrid'
02/17 13:52:57 moving_average None
02/17 13:52:57 no_gpu False
02/17 13:52:57 optimizer 'sgd'
02/17 13:52:57 orthogonal_init False
02/17 13:52:57 output None
02/17 13:52:57 output_dropout 0.0
02/17 13:52:57 parallel_iterations 16
02/17 13:52:57 pervasive_dropout False
02/17 13:52:57 pooling_avg True
02/17 13:52:57 post_process_script None
02/17 13:52:57 pred_deep_layer False
02/17 13:52:57 pred_edits False
02/17 13:52:57 pred_embed_proj True
02/17 13:52:57 pred_maxout_layer True
02/17 13:52:57 purge False
02/17 13:52:57 raw_output False
02/17 13:52:57 read_ahead 1
02/17 13:52:57 remove_unk False
02/17 13:52:57 reverse_input True
02/17 13:52:57 rnn_feed_attn True
02/17 13:52:57 rnn_input_dropout 0.0
02/17 13:52:57 rnn_output_dropout 0.0
02/17 13:52:57 rnn_state_dropout 0.0
02/17 13:52:57 save False
02/17 13:52:57 score_function 'nltk_sentence_bleu'
02/17 13:52:57 script_dir 'scripts'
02/17 13:52:57 sgd_after_n_epoch None
02/17 13:52:57 sgd_learning_rate 1.0
02/17 13:52:57 shuffle True
02/17 13:52:57 softmax_temperature 1.0
02/17 13:52:57 steps_per_checkpoint 2000
02/17 13:52:57 steps_per_eval 2000
02/17 13:52:57 swap_memory True
02/17 13:52:57 tie_embeddings False
02/17 13:52:57 time_pooling None
02/17 13:52:57 train True
02/17 13:52:57 train_initial_states True
02/17 13:52:57 train_prefix 'train'
02/17 13:52:57 truncate_lines True
02/17 13:52:57 update_first False
02/17 13:52:57 use_dropout False
02/17 13:52:57 use_lstm_full_state False
02/17 13:52:57 use_previous_word True
02/17 13:52:57 verbose True
02/17 13:52:57 vocab_prefix 'vocab'
02/17 13:52:57 weight_scale None
02/17 13:52:57 word_dropout 0.0
02/17 13:52:57 python random seed: 8107473215777315132
02/17 13:52:57 tf random seed: 2144161570693003299
02/17 13:52:57 creating model
02/17 13:52:57 using device: /gpu:6
02/17 13:52:57 copying vocab to ../emse-data/model/hybrid/data/vocab.sbt
Traceback (most recent call last):
  File "__main__.py", line 329, in <module>
    main()
  File "__main__.py", line 279, in main
    model = TranslationModel(**config)
  File "/home/gpuadmin/ahjeong/EMSE-DeepCom/source code/translation_model.py", line 63, in __init__
    ref_ext=ref_ext, binary=self.binary, **kwargs)
  File "/home/gpuadmin/ahjeong/EMSE-DeepCom/source code/utils.py", line 225, in get_filenames
    shutil.copy(src, dest)
  File "/home/gpuadmin/anaconda3/envs/deepcom/lib/python3.5/shutil.py", line 241, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/gpuadmin/anaconda3/envs/deepcom/lib/python3.5/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '../emse-data/vocab.sbt'
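
A vocabulary file can usually be derived from the tokenized training file. The sketch below is one hedged way to do it: it assumes a whitespace-tokenized corpus and a vocabulary format of one token per line, neither of which is confirmed anywhere in these issues. The function names, the `max_size` cut-off and the special tokens are all hypothetical, so compare the result against a vocab file from the released dataset before training.

```python
from collections import Counter


def build_vocab(lines, max_size=None, specials=()):
    """Return tokens ordered by descending frequency, ties broken alphabetically."""
    counts = Counter(token for line in lines for token in line.split())
    ordered = sorted(counts, key=lambda token: (-counts[token], token))
    if max_size is not None:
        ordered = ordered[:max_size]
    return list(specials) + ordered


def write_vocab(tokens, path):
    """Write one token per line (assumed format)."""
    with open(path, "w", encoding="utf-8") as handle:
        for token in tokens:
            handle.write(token + "\n")


corpus = ["( A ( B ) B ) A", "( A ) A"]
print(build_vocab(corpus, max_size=3))
```

With a real corpus, `lines` could be an open handle on train.sbt, and the same function would apply to the .code and .nl files.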

How to get the vocab.nl with a new dataset?

I used DeepCom to train on a new dataset, but I can't get a vocab.nl like yours.

Could you please share the dataset?

Hi, could you please share the dataset through Google Drive or Dropbox so that I can reproduce the performance? Thank you very much!
I really need your help!
