imn's Issues

congratulations

Congratulations on winning first place, and thanks for sharing the code. I wonder whether you have used a pre-training algorithm for multi-turn dialogue work?

MV-LST?

Hello, is the MV-LST model listed in Table 3 of the paper a typo for MV-LSTM?

loss calculated as nan

Hello project owners,
I have been facing an issue: when the model is trained on the Ubuntu V2 dataset, it computes loss=nan right from the very beginning, and even after 2000 steps this does not change. I have not changed the code at all, except that I upgraded it to support the latest Python and TensorFlow v2. I checked every scenario I could think of that can lead to NaN values, but it still won't compute a loss.

Any clues ?
Really appreciate your response.
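Not an answer from the maintainers, but one common source of NaN loss after a TF2 port is a numerically unstable softmax on large logits. A minimal NumPy sketch (a toy illustration, not this project's code) of the failure mode and the standard max-subtraction fix:

```python
import numpy as np

def naive_softmax(logits):
    # Overflows for large logits: exp(1000) -> inf, and inf / inf -> nan
    e = np.exp(logits)
    return e / e.sum()

def stable_softmax(logits):
    # Subtracting the max leaves the result mathematically unchanged,
    # but keeps every exponent <= 0, so nothing overflows.
    e = np.exp(logits - logits.max())
    return e / e.sum()

logits = np.array([1000.0, 0.0, -1000.0])
with np.errstate(over="ignore", invalid="ignore"):
    print(np.isnan(naive_softmax(logits)).any())  # True: overflow produced NaN
print(stable_softmax(logits))                     # close to [1., 0., 0.]
```

For locating where the first NaN appears in a TF graph, wrapping suspect tensors with `tf.debugging.check_numerics` is the usual approach.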

Question about model selection

Hello, in your paper, were the models for all datasets selected on the validation set using the MRR metric?

Number of training steps

Thanks for sharing the code!
May I ask how many steps the model you finally used for testing was trained for?

Where do the candidate responses in the dataset come from?

Hello, I read through your data and code carefully. The dataset in the Google Drive you provided is formatted as utterance + candidate response ids (48107|318604|102642 72282|209105|614088|537827|146615|12490|509973). May I ask how these positive and negative examples were obtained? This is my first time working on this task, so please forgive the beginner question.
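For reference, the candidate-id format shown above can be parsed straightforwardly. A minimal sketch (the assumed layout of one positive group followed by one negative group is my reading of the example, not taken from the repo's loader):

```python
def parse_candidates(line):
    """Parse 'posid1|posid2 negid1|negid2|...' into two id lists.

    Assumption: the first whitespace-separated field holds the
    positive response ids, the second the negative response ids,
    each group joined with '|'.
    """
    pos_field, neg_field = line.split()
    positives = [int(i) for i in pos_field.split("|")]
    negatives = [int(i) for i in neg_field.split("|")]
    return positives, negatives

pos, neg = parse_candidates("48107|318604|102642 72282|209105|614088")
print(pos)  # [48107, 318604, 102642]
print(neg)  # [72282, 209105, 614088]
```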

The process gets stuck after running python eval.py

Hi,
I ran 'bash douban_train.sh' and got some results like:
[screenshot]

Then the process gets stuck after I run 'bash douban_test.sh':
[screenshot]
There is no apparent reason for it to get stuck, because the next lines are
'test_dataset = data_helpers.load_dataset(FLAGS.test_file, vocab, FLAGS.max_utter_len, FLAGS.max_utter_num, response_data)
print('test_pairs: {}'.format(len(test_dataset)))'

GloVe + word2vec mixed embedding

Good day to you, Mr. Jia-Chen Gu.
Could you kindly share the code you used to create this mixed embedding for the Ubuntu Dialogue Corpus v2 task?
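Not the authors' script, but a minimal sketch of one plausible way to build such a mixed embedding: concatenate each word's GloVe and word2vec vectors, zero-filling whichever source lacks the word. The dictionaries below are toy stand-ins for the real pretrained vectors:

```python
import numpy as np

def mix_embeddings(vocab, glove, w2v, d_glove, d_w2v):
    """Concatenate GloVe and word2vec vectors per word; zeros for missing halves."""
    table = np.zeros((len(vocab), d_glove + d_w2v), dtype=np.float32)
    for row, word in enumerate(vocab):
        if word in glove:
            table[row, :d_glove] = glove[word]   # first block: GloVe vector
        if word in w2v:
            table[row, d_glove:] = w2v[word]     # second block: word2vec vector
    return table

glove = {"hello": np.ones(3)}        # toy 3-dim "GloVe"
w2v = {"world": np.full(2, 2.0)}     # toy 2-dim "word2vec"
emb = mix_embeddings(["hello", "world"], glove, w2v, 3, 2)
print(emb.shape)  # (2, 5)
```

Whether the paper's embedding was built by concatenation or some other combination is exactly the question here; this sketch only shows the concatenation variant.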

Question about the sentences in responses.txt

Hello, I would like to ask about the negative examples:
[screenshot]
Are the ids underlined in red also indexes into responses.txt?
Some of the cases look strange to me; they actually all seem to be positive examples.
[screenshot]

Graph Execution Error on running tensorflow session

Hello Mr. Jia-Chen Gu!
I need help: I am facing a graph execution error when trying to train the model on Google Colab with the Ubuntu Dialogue Corpus V2. I am stuck at this point, although I previously trained the model once and the graph ran without any issues. It throws an error as soon as 'sess.run(tf.compat.v1.global_variables_initializer())' is executed. I have not changed the behavior of the existing code, except that I upgraded it to TensorFlow 2.6. The exact error description and details are pasted below:

============================================================
2023-02-24 06:35:24.445231: W tensorflow/c/c_api.cc:349] Operation '{name:'encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while' id:427 op device:{requested: '', assigned: ''} def:{{{node encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while}} = While[T=[DT_INT32, DT_INT32, DT_INT32, DT_VARIANT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_VARIANT, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT], _lower_using_switch_merge=true, _num_original_outputs=48, _read_only_resource_inputs=[], _stateful_parallelism=false, body=encoding_layer_bidirectional_rnn_bidirectional_rnn0_bidirectional_rnn_bw_bw_while_body_484_rewritten[], cond=encoding_layer_bidirectional_rnn_bidirectional_rnn0_bidirectional_rnn_bw_bw_while_cond_483_rewritten[], output_shapes=[[], [], [], [], [?,200], [?,200], [], [], [?], [550,800], [800], , [?,200], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []], parallel_iterations=32](encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while/loop_counter, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/strided_slice_1, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/time, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/TensorArrayV2, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/DropoutWrapperZeroState/LSTMCellZeroState/zeros, 
encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/DropoutWrapperZeroState/LSTMCellZeroState/zeros_1, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/Minimum, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorListFromTensor, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/CheckSeqLen, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/lstm_cell/kernel/Read/Identity, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/lstm_cell/bias/Read/Identity, dropout_keep_prob, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/zeros, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Select_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/GreaterEqual_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_3_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Mul_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Mul_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/Cast_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/RealDiv_0/accumulator:0, 
gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/RealDiv_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/RealDiv_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/Sub_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/mul_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_2_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_2_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Tanh_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Sub_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Sub_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/sub_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/sub_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_1_grad/Shape_0/accumulator:0, 
gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_1_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_1_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_1_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Tanh_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/concat_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/TensorArrayV2Read/TensorListGetItem_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/TensorArrayV2Read/TensorListGetItem_grad/TensorListElementShape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/TensorArrayV2Read/TensorListGetItem_grad/TensorListLength_0/accumulator:0)}}' was changed by setting attribute after it 
was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1377, in _do_call
return fn(*args)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
self._extend_graph()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1400, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'gradients/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad': Connecting to invalid output 13 of source node aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while which has 13 outputs. Try using tf.compat.v1.experimental.output_all_intermediates(True).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 184, in
sess.run(tf.compat.v1.global_variables_initializer()) # added compat.v1. in existing code
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1190, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1396, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Node 'gradients/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad': Connecting to invalid output 13 of source node aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while which has 13 outputs. Try using tf.compat.v1.experimental.output_all_intermediates(True).

Nothing in checkpoints when running python train.py, and a GPU usage problem

OS: Ubuntu 16.04.6 LTS
1.
[screenshot]
When running python train.py, nothing appears in checkpoints, although there are some results showing accuracy and loss.
[screenshot]
2.
I tried:
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
or
CUDA_VISIBLE_DEVICES=0 (the same way as in your train.sh)
but my GPU isn't being used.
[screenshot]

Have you ever come across this kind of problem? Looking forward to your reply!
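One frequent cause of "GPU isn't used" symptoms (a general TensorFlow gotcha, not specific to this repo): `CUDA_VISIBLE_DEVICES` must be set before TensorFlow is imported, because device enumeration happens at import time. A hedged sketch:

```python
import os

# CUDA device visibility is read when TensorFlow initializes, so this
# assignment must run before the first `import tensorflow` in the process;
# setting it after the import has no effect on the devices TF already found.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# After importing TensorFlow, confirm it actually sees a GPU, e.g.:
#   import tensorflow as tf
#   print(tf.config.list_physical_devices("GPU"))
```

On the shell side, note that a bare `CUDA_VISIBLE_DEVICES=0` line does nothing on its own; it must be on the same line as the command (`CUDA_VISIBLE_DEVICES=0 python train.py`) or exported first.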

About the computation of the evaluation metrics

For example, with the Douban corpus:
Situation: none of the 10 candidates is a positive example. In your code, such sessions are excluded from the R10@1 statistics; is that right?
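To illustrate the convention being asked about (sketched from the common practice in Douban evaluation generally, not taken from this repo's code): sessions whose candidate lists contain no positive label are skipped when computing R10@1, since no ranking could ever retrieve a correct response for them:

```python
def recall_at_1(sessions):
    """sessions: list of (scores, labels) pairs over the same candidates.

    Sessions with no positive label are excluded from the denominator
    (the assumed convention for corpora like Douban with human labels).
    """
    hits = total = 0
    for scores, labels in sessions:
        if not any(labels):    # no positive among the candidates
            continue           # skip the session entirely
        total += 1
        top = max(range(len(scores)), key=scores.__getitem__)
        hits += labels[top]
    return hits / total if total else 0.0

sessions = [
    ([0.9, 0.1], [1, 0]),   # hit: top-ranked candidate is positive
    ([0.2, 0.8], [1, 0]),   # miss: a negative is ranked first
    ([0.5, 0.5], [0, 0]),   # no positive at all: skipped
]
print(recall_at_1(sessions))  # 0.5
```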

Batch Size impact on model performance

Hi @JasonForJoy, can you please confirm whether reducing the batch size (due to only a low-end GPU machine being available) can affect the performance values of the model?

Secondly, no batch size, including the recommended one (i.e. 96), is a divisor of the total number of training, validation, or test samples. As a result, the last batch misses out on some samples. Does this contribute to lowering recall and the other metric values?
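The size of the dropped remainder can be quantified directly; a quick sketch (plain arithmetic, and the dataset size below is illustrative, not the actual corpus size):

```python
def dropped_last_batch(num_examples, batch_size):
    """Examples lost if an incomplete final batch is discarded."""
    return num_examples % batch_size

# Illustrative: a 500,000-example test set with the recommended batch size 96
lost = dropped_last_batch(500_000, 96)
print(lost, f"{lost / 500_000:.4%}")  # 32 examples, ~0.0064% of the set
```

A remainder this small barely moves aggregate metrics; padding the last batch and masking the padding, or choosing an eval batch size that divides the set, removes the effect entirely (general advice, not repo-specific).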

Slightly lower results on MAP and MRR

Hi,
Thanks for your open code. I tried to reproduce the result in the paper using the code with all parameters unchanged, but I got results about 2-3 points lower on MAP and MRR with the Douban Corpus. Could you tell me after how many epochs you got the best result, and how long it took?

Best wishes,
Sjzwind

Hardware specifications for IMN model

Hello team @JasonForJoy,
Is there any document that lists the hardware requirements of the model, i.e. the minimum specs of a computer system needed to run the model on the original UDC (900k training dialogues) with the default training parameters (i.e. 1,000,000 epochs and batch size 128, with evaluation every 1000 steps)?

I have tried running it with a Colab Pro account with a premium A100 GPU and high RAM enabled, using only a reduced dataset (around 10,000 training dialogues), and it still took a very long time: around 15+ hours for just 4 epochs on this setting!

Am I missing something here that you could help me with to speed up the training time?

Looking forward to your help.
Regards.

A few questions about the Douban dataset

@JasonForJoy Sorry to bother you. I had the pleasure of reading your paper, and it is excellent! I have a few questions:
1. In the Douban training set, each example is annotated with only one positive and one negative response:
[screenshot]
May I ask how this positive and this negative example were selected? Is there any rule?
2. In the Douban test set test.txt:
[screenshot]
The positive and negative examples add up to 10 candidates. Is this the result of a coarse pre-filtering step? If so, what rule was used for the filtering? Also, the number of positives can be NA; what is the reason for that?
3. If there is code for the dataset filtering rules, could you send me a copy? Thank you very much! My email is [email protected]

Model hyperparameters

Hello Mr. Jia-Chen Gu,
I wanted to clear up a confusion so that I can replicate the model results as-is.

In your paper, 'Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots', the embedding dimension used on the original UDC is 200 with a dropout probability of 0.2, whereas in the 'train.sh' script at https://github.com/JasonForJoy/IMN/blob/master/Ubuntu_V1/scripts/ubuntu_train.sh the embedding_dim is 400 and the dropout probability is 0.8.

Which settings should be used to replicate the published results?

Your response shall be much appreciated.
