galsang / abcnn
Implementation of ABCNN(Attention-Based Convolutional Neural Network) on Tensorflow
==================================================
test data size: 10
2018-04-08 12:11:47.494259: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 12:11:47.494336: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 12:11:47.494347: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 12:11:47.494355: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 12:11:47.494364: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.4/contextlib.py", line 66, in __exit__
next(self.gen)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [40,300] rhs shape= [96,300]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@CNN-1/aW"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](CNN-1/aW, save/RestoreV2)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 147, in <module>
classifier=params["classifier"], word2vec=params["word2vec"])
File "test.py", line 39, in test
saver.restore(sess, model_path + "-" + str(e))
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 1548, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [40,300] rhs shape= [96,300]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@CNN-1/aW"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](CNN-1/aW, save/RestoreV2)]]
Caused by op 'save/Assign', defined at:
File "test.py", line 147, in <module>
classifier=params["classifier"], word2vec=params["word2vec"])
File "test.py", line 38, in test
saver = tf.train.Saver()
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 1139, in __init__
self.build()
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 1170, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 691, in build
restore_sequentially, reshape)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/ops/state_ops.py", line 271, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [40,300] rhs shape= [96,300]
[[Node: save/Assign = Assign[T=DT_FLOAT, _class=["loc:@CNN-1/aW"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](CNN-1/aW, save/RestoreV2)]]
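In this error, `lhs` is the shape of the `CNN-1/aW` variable in the graph that `test.py` has just built (`[40,300]`), and `rhs` is the shape of the value stored in the checkpoint (`[96,300]`). One possible cause, which is an assumption and not something the log confirms, is that the first dimension depends on the longest sentence in the loaded data, so a small test set builds a smaller variable than the training set did. A minimal sketch of a check that could run before `saver.restore` is shown below; `find_shape_mismatches` and both dictionaries are hypothetical, and in a real run the shapes would be read from the graph and from the checkpoint file.

```python
def find_shape_mismatches(graph_shapes, checkpoint_shapes):
    """Return {name: (graph_shape, checkpoint_shape)} for variables whose shapes differ.

    Both arguments are plain dicts mapping a variable name to a list of ints.
    Variables missing from the checkpoint are ignored here.
    """
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and list(ckpt_shape) != list(graph_shape):
            mismatches[name] = (list(graph_shape), list(ckpt_shape))
    return mismatches


# Shapes copied from the traceback above; the dict layout is an assumption.
graph_shapes = {"CNN-1/aW": [40, 300]}
checkpoint_shapes = {"CNN-1/aW": [96, 300]}

mismatches = find_shape_mismatches(graph_shapes, checkpoint_shapes)
for name, (graph_shape, ckpt_shape) in mismatches.items():
    print("%s: graph %s vs checkpoint %s" % (name, graph_shape, ckpt_shape))
```

If the mismatch does come from sentence length, building the test graph with the same maximum length that was used for training would make the two shapes agree.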
Hi, there is a corpus that can help evaluate networks for question answering systems.
I think it is of no use for answer selection.
Hi, I cannot find the code for your paper. Where can I find it, please?
https://github.com/galsang/SG-BERT
Hi galsang, thank you for sharing the code. I ran it following the README. The cost decreases during training, like this:
[Epoch 4]
('[batch 100] cost:', 0.40089718)
('[batch 200] cost:', 0.27272141)
('[batch 300] cost:', 0.16169518)
('model saved as', './models/WikiQA-BCNN-2-4')
('LR saved as', './models/WikiQA-BCNN-2-4-LR.pkl')
('SVM saved as', './models/WikiQA-BCNN-2-4-SVM.pkl')
However, when I test, MAP and MRR are both 0, and the result file is empty. Do you know how to solve this problem, or what I should pay attention to when running the code? Thank you very much!
('[Epoch 1] MAP:', 0, '/ MRR:', 0)
('./models/WikiQA-BCNN-2-2', 'restored.')
('./models/WikiQA-BCNN-2-2-LR.pkl', 'restored.')
('[Epoch 2] MAP:', 0, '/ MRR:', 0)
('./models/WikiQA-BCNN-2-3', 'restored.')
('./models/WikiQA-BCNN-2-3-LR.pkl', 'restored.')
('[Epoch 3] MAP:', 0, '/ MRR:', 0)
('./models/WikiQA-BCNN-2-4', 'restored.')
('./models/WikiQA-BCNN-2-4-LR.pkl', 'restored.')
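For reference when debugging a result like this, the sketch below computes MAP and MRR from scored candidates using the standard definitions. It is not the repository's evaluation code; the function names and the input layout (a dict mapping each question to a list of `(score, label)` pairs) are assumptions made for illustration. Under these definitions both metrics are 0 when no candidates reach the evaluation at all, which would be consistent with an empty result file, though the log does not confirm that this is the cause.

```python
def average_precision(labels):
    """labels: 1 (correct) or 0 (incorrect), ordered by descending score."""
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels):
    """1 / rank of the first correct answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def map_mrr(questions):
    """questions: {question: [(score, label), ...]} (hypothetical layout)."""
    aps, rrs = [], []
    for candidates in questions.values():
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        labels = [label for _, label in ranked]
        aps.append(average_precision(labels))
        rrs.append(reciprocal_rank(labels))
    if not aps:  # nothing to evaluate
        return 0.0, 0.0
    return sum(aps) / len(aps), sum(rrs) / len(rrs)


# Made-up scores, for illustration only.
questions = {
    "q1": [(0.9, 0), (0.8, 1), (0.1, 0)],
    "q2": [(0.7, 1), (0.2, 0)],
}
print(map_mrr(questions))
print(map_mrr({}))
```

Printing the number of questions and candidates that enter the evaluation is a quick way to tell an empty input apart from a model that really ranks every correct answer last.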
ABCNN/MSRP_Corpus/msr_paraphrase_dev.txt
Line 100 in f95b9a1
I trained on my own data, and the cost became NaN. I checked the data, clipped the gradients, and reduced the learning rate, but the NaN still occurred at the same 'batch_size*batch' location. Is there anything else I should check or change to make training run normally? Thanks for any suggestions.
The NaN error looks like this:
[batch 1044] cost: 2.06923
[batch 1045] cost: 1.79236
[batch 1046] cost: 1.9501
[batch 1047] cost: 1.86483
[batch 1048] cost: nan
[batch 1049] cost: nan
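Because the NaN shows up at the same batch every time, one thing worth ruling out is a specific bad example inside that batch. The sketch below is a hypothetical helper, not part of the repository: it assumes each example is a tuple of two token lists plus a list of numeric features, and that the `[batch N]` counter in the log is 1-based. It flags empty sentences, which may lead to a division by zero in similarity or pooling steps, and non-finite feature values.

```python
import math


def find_suspect_examples(examples, batch_size, batch_index):
    """Return (position, reason) pairs for suspicious examples in one batch.

    examples    -- list of (sentence1_tokens, sentence2_tokens, features)
    batch_index -- 1-based batch number (assumed to match the log counter)
    """
    start = (batch_index - 1) * batch_size
    batch = examples[start:start + batch_size]
    suspects = []
    for pos, (s1, s2, features) in enumerate(batch, start=start):
        if len(s1) == 0 or len(s2) == 0:
            suspects.append((pos, "empty sentence"))
        if any(not math.isfinite(f) for f in features):
            suspects.append((pos, "non-finite feature"))
    return suspects


# Made-up data, for illustration only.
examples = [
    (["a"], ["b"], [1.0]),
    ([], ["c"], [0.5]),
    (["d"], ["e"], [float("nan")]),
    (["f"], ["g"], [2.0]),
]
print(find_suspect_examples(examples, batch_size=2, batch_index=1))
print(find_suspect_examples(examples, batch_size=2, batch_index=2))
```

If the data is shuffled between epochs, the same batch number will not map to the same examples, so the check would need to run on the batch as it is actually fed to the model.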
ValueError: Variable CNN-1/conv/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope? Has anyone else run into the same issue?
Traceback (most recent call last):
File "train.py", line 129, in <module>
data_type=params["data_type"], word2vec=params["word2vec"])
File "train.py", line 26, in train
num_features=train_data.num_features, num_classes=num_classes, num_layers=num_layers)
File "/Users/yuanling/Downloads/ABCNN-master/ABCNN.py", line 176, in __init__
LI_1, LO_1, RI_1, RO_1 = CNN_layer(variable_scope="CNN-1", x1=x1_expanded, x2=x2_expanded, d=d0)
File "/Users/yuanling/Downloads/ABCNN-master/ABCNN.py", line 152, in CNN_layer
left_conv = convolution(name_scope="left", x=pad_for_wide_conv(x1), d=d)
File "/Users/yuanling/Downloads/ABCNN-master/ABCNN.py", line 64, in convolution
scope=scope
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 918, in convolution
outputs = layer.apply(inputs)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 320, in apply
return self.call(inputs, **kwargs)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 286, in call
self.build(input_shapes[0])
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/layers/convolutional.py", line 138, in build
dtype=self.dtype)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 349, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1389, in wrapped_custom_getter
*args, **kwargs)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 275, in variable_getter
variable_getter=functools.partial(getter, **kwargs))
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 228, in _add_variable
trainable=trainable and self.trainable)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1334, in layer_variable_getter
return _model_variable_getter(getter, *args, **kwargs)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1326, in _model_variable_getter
custom_getter=getter, use_resource=use_resource)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 262, in model_variable
use_resource=use_resource)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 217, in variable
use_resource=use_resource)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
use_resource=use_resource)
File "/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 671, in _get_single_variable
"VarScope?" % name)
ValueError: Variable CNN-1/conv/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
Line 38 in f95b9a1
Thank you for this awesome repo.
This is not actually a code issue; I'm just curious. Do you have any idea why we need an extra linear model (LR) or SVM for the prediction? This module is not trained through backpropagation with the rest of the network at all.
Or did you find some improvement from using LR or SVM compared with the network's fully connected output layer?
Thanks