imn's Issues

congratulations

Congratulations on winning first place, and thanks for sharing the code. I wonder whether you have used a pre-training algorithm for multi-turn dialogue work?

MV-LST?

Hello, is the MV-LST model listed in Table 3 of the paper a typo for MV-LSTM?

loss calculated as nan

Hello project owners,
I have been facing an issue: when the model is trained on the Ubuntu V2 dataset, it computes loss=nan right from the very beginning, and even after 2000 steps this does not change. I have not changed the code at all, except that I upgraded it to support the latest Python and TensorFlow v2. I checked every scenario I could think of that can lead to NaN values, but it still won't compute a loss.

Any clues ?
Really appreciate your response.
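Not an answer from the maintainers, but one common source of NaN loss after a TF2 port is a numerically unstable softmax on large logits. A minimal NumPy sketch (a toy illustration, not this project's code) of the failure mode and the standard max-subtraction fix:

```python
import numpy as np

def naive_softmax(logits):
    # Overflows for large logits: exp(1000) -> inf, and inf / inf -> nan
    e = np.exp(logits)
    return e / e.sum()

def stable_softmax(logits):
    # Subtracting the max leaves the result mathematically unchanged,
    # but keeps every exponent <= 0, so nothing overflows.
    e = np.exp(logits - logits.max())
    return e / e.sum()

logits = np.array([1000.0, 0.0, -1000.0])
with np.errstate(over="ignore", invalid="ignore"):
    print(np.isnan(naive_softmax(logits)).any())  # True: overflow produced NaN
print(stable_softmax(logits))                     # close to [1., 0., 0.]
```

For locating where the first NaN appears in a TF graph, wrapping suspect tensors with `tf.debugging.check_numerics` is the usual approach.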

Question about model selection

Hello, in your paper, were the models for all datasets selected on the validation set using the MRR metric?

Number of training steps

Thanks for sharing the code!
May I ask how many steps the model you finally used for testing was trained for?

Where do the candidate responses in the dataset come from?

Hello, I read through your data and code carefully. The dataset in the Google Drive you provided is formatted as utterance + candidate response ids (48107|318604|102642 72282|209105|614088|537827|146615|12490|509973). May I ask how these positive and negative examples were obtained? This is my first time working on this task, so please forgive the beginner question.
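For reference, the candidate-id format shown above can be parsed straightforwardly. A minimal sketch (the assumed layout of one positive group followed by one negative group is my reading of the example, not taken from the repo's loader):

```python
def parse_candidates(line):
    """Parse 'posid1|posid2 negid1|negid2|...' into two id lists.

    Assumption: the first whitespace-separated field holds the
    positive response ids, the second the negative response ids,
    each group joined with '|'.
    """
    pos_field, neg_field = line.split()
    positives = [int(i) for i in pos_field.split("|")]
    negatives = [int(i) for i in neg_field.split("|")]
    return positives, negatives

pos, neg = parse_candidates("48107|318604|102642 72282|209105|614088")
print(pos)  # [48107, 318604, 102642]
print(neg)  # [72282, 209105, 614088]
```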

The process gets stuck after running python eval.py

Hi,
I ran 'bash douban_train.sh' and got some results like:
[screenshot]

Then the process gets stuck after I run 'bash douban_test.sh':
[screenshot]
There is no apparent reason for it to get stuck, because the next lines are
'test_dataset = data_helpers.load_dataset(FLAGS.test_file, vocab, FLAGS.max_utter_len, FLAGS.max_utter_num, response_data)
print('test_pairs: {}'.format(len(test_dataset)))'

GloVe + word2vec mixed embedding

Good day to you, Mr. Jia-Chen Gu.
Could you kindly share the code you used to create this mixed embedding for the Ubuntu Dialogue Corpus v2 task?
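Not the authors' script, but a minimal sketch of one plausible way to build such a mixed embedding: concatenate each word's GloVe and word2vec vectors, zero-filling whichever source lacks the word. The dictionaries below are toy stand-ins for the real pretrained vectors:

```python
import numpy as np

def mix_embeddings(vocab, glove, w2v, d_glove, d_w2v):
    """Concatenate GloVe and word2vec vectors per word; zeros for missing halves."""
    table = np.zeros((len(vocab), d_glove + d_w2v), dtype=np.float32)
    for row, word in enumerate(vocab):
        if word in glove:
            table[row, :d_glove] = glove[word]   # first block: GloVe vector
        if word in w2v:
            table[row, d_glove:] = w2v[word]     # second block: word2vec vector
    return table

glove = {"hello": np.ones(3)}        # toy 3-dim "GloVe"
w2v = {"world": np.full(2, 2.0)}     # toy 2-dim "word2vec"
emb = mix_embeddings(["hello", "world"], glove, w2v, 3, 2)
print(emb.shape)  # (2, 5)
```

Whether the paper's embedding was built by concatenation or some other combination is exactly the question here; this sketch only shows the concatenation variant.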

Question about the sentences in responses.txt

Hello, I would like to ask about the negative examples:
[screenshot]
Are the ids underlined in red also indexes into responses.txt?
Some of the cases look strange to me; they actually all seem to be positive examples.
[screenshot]

Graph Execution Error on running tensorflow session

Hello Mr. Jia-Chen Gu!
I need help: I am facing a graph execution error when trying to train the model on Google Colab with the Ubuntu Dialogue Corpus V2. I am stuck at this point, although I previously trained the model once and the graph ran without any issues. It throws an error as soon as 'sess.run(tf.compat.v1.global_variables_initializer())' is executed. I have not changed the behavior of the existing code, except that I upgraded it to TensorFlow 2.6. The exact error description and details are pasted below:

============================================================
2023-02-24 06:35:24.445231: W tensorflow/c/c_api.cc:349] Operation '{name:'encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while' id:427 op device:{requested: '', assigned: ''} def:{{{node encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while}} = While[T=[DT_INT32, DT_INT32, DT_INT32, DT_VARIANT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_VARIANT, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT, DT_VARIANT], _lower_using_switch_merge=true, _num_original_outputs=48, _read_only_resource_inputs=[], _stateful_parallelism=false, body=encoding_layer_bidirectional_rnn_bidirectional_rnn0_bidirectional_rnn_bw_bw_while_body_484_rewritten[], cond=encoding_layer_bidirectional_rnn_bidirectional_rnn0_bidirectional_rnn_bw_bw_while_cond_483_rewritten[], output_shapes=[[], [], [], [], [?,200], [?,200], [], [], [?], [550,800], [800], , [?,200], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []], parallel_iterations=32](encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while/loop_counter, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/strided_slice_1, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/time, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/TensorArrayV2, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/DropoutWrapperZeroState/LSTMCellZeroState/zeros, 
encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/DropoutWrapperZeroState/LSTMCellZeroState/zeros_1, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/Minimum, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/TensorArrayUnstack/TensorListFromTensor, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/CheckSeqLen, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/lstm_cell/kernel/Read/Identity, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/lstm_cell/bias/Read/Identity, dropout_keep_prob, encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/zeros, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Select_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/GreaterEqual_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/Placeholder_3_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Mul_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Mul_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/Cast_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/RealDiv_0/accumulator:0, 
gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/RealDiv_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/RealDiv_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/dropout/Sub_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/mul_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_2_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_2_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Tanh_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_2_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Sub_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/dropout/Sub_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/sub_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/sub_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_1_grad/Shape_0/accumulator:0, 
gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_1_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_1_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/mul_1_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Tanh_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/Sigmoid_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_grad/Shape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/lstm_cell/add_grad/Shape_1_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/lstm_cell/concat_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/TensorArrayV2Read/TensorListGetItem_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/TensorArrayV2Read/TensorListGetItem_grad/TensorListElementShape_0/accumulator:0, gradients/encoding_layer/bidirectional_rnn/bidirectional_rnn0/bidirectional_rnn/bw/bw/while_grad/gradients/TensorArrayV2Read/TensorListGetItem_grad/TensorListLength_0/accumulator:0)}}' was changed by setting attribute after it 
was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1377, in _do_call
return fn(*args)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
self._extend_graph()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1400, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'gradients/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad': Connecting to invalid output 13 of source node aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while which has 13 outputs. Try using tf.compat.v1.experimental.output_all_intermediates(True).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 184, in
sess.run(tf.compat.v1.global_variables_initializer()) # added compat.v1. in existing code
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1190, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1396, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Node 'gradients/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad/aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while_grad': Connecting to invalid output 13 of source node aggregation_layer/bidirectional_rnn_aggregation/bidirectional_rnn/fw/fw/while which has 13 outputs. Try using tf.compat.v1.experimental.output_all_intermediates(True).

Nothing in checkpoints when running python train.py, and a GPU usage problem

OS: Ubuntu 16.04.6 LTS
1.
[screenshot]
When running python train.py, nothing appears in checkpoints, although there are some results showing accuracy and loss.
[screenshot]
2.
I tried:
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
or
CUDA_VISIBLE_DEVICES=0 (the same way as in your train.sh)
but my GPU isn't being used.
[screenshot]

Have you ever come across this kind of problem? Looking forward to your reply!
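One frequent cause of "GPU isn't used" symptoms (a general TensorFlow gotcha, not specific to this repo): `CUDA_VISIBLE_DEVICES` must be set before TensorFlow is imported, because device enumeration happens at import time. A hedged sketch:

```python
import os

# CUDA device visibility is read when TensorFlow initializes, so this
# assignment must run before the first `import tensorflow` in the process;
# setting it after the import has no effect on the devices TF already found.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# After importing TensorFlow, confirm it actually sees a GPU, e.g.:
#   import tensorflow as tf
#   print(tf.config.list_physical_devices("GPU"))
```

On the shell side, note that a bare `CUDA_VISIBLE_DEVICES=0` line does nothing on its own; it must be on the same line as the command (`CUDA_VISIBLE_DEVICES=0 python train.py`) or exported first.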

About the computation of the evaluation metrics

For example, with the Douban corpus:
Situation: none of the 10 candidates is a positive example. In your code, such sessions are excluded from the R10@1 statistics; is that right?
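To illustrate the convention being asked about (sketched from the common practice in Douban evaluation generally, not taken from this repo's code): sessions whose candidate lists contain no positive label are skipped when computing R10@1, since no ranking could ever retrieve a correct response for them:

```python
def recall_at_1(sessions):
    """sessions: list of (scores, labels) pairs over the same candidates.

    Sessions with no positive label are excluded from the denominator
    (the assumed convention for corpora like Douban with human labels).
    """
    hits = total = 0
    for scores, labels in sessions:
        if not any(labels):    # no positive among the candidates
            continue           # skip the session entirely
        total += 1
        top = max(range(len(scores)), key=scores.__getitem__)
        hits += labels[top]
    return hits / total if total else 0.0

sessions = [
    ([0.9, 0.1], [1, 0]),   # hit: top-ranked candidate is positive
    ([0.2, 0.8], [1, 0]),   # miss: a negative is ranked first
    ([0.5, 0.5], [0, 0]),   # no positive at all: skipped
]
print(recall_at_1(sessions))  # 0.5
```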

Batch Size impact on model performance

Hi @JasonForJoy, can you please confirm whether reducing the batch size (due to only a low-end GPU machine being available) can affect the performance values of the model?

Secondly, no batch size, including the recommended one (i.e. 96), is a divisor of the total number of training, validation, or test samples. As a result, the last batch misses out on some samples. Does this contribute to lowering recall and the other metric values?
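The size of the dropped remainder can be quantified directly; a quick sketch (plain arithmetic, and the dataset size below is illustrative, not the actual corpus size):

```python
def dropped_last_batch(num_examples, batch_size):
    """Examples lost if an incomplete final batch is discarded."""
    return num_examples % batch_size

# Illustrative: a 500,000-example test set with the recommended batch size 96
lost = dropped_last_batch(500_000, 96)
print(lost, f"{lost / 500_000:.4%}")  # 32 examples, ~0.0064% of the set
```

A remainder this small barely moves aggregate metrics; padding the last batch and masking the padding, or choosing an eval batch size that divides the set, removes the effect entirely (general advice, not repo-specific).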

Slightly lower results on MAP and MRR

Hi,
Thanks for your open code. I tried to reproduce the result in the paper using the code with all parameters unchanged, but I got results about 2-3 points lower on MAP and MRR with the Douban Corpus. Could you tell me after how many epochs you got the best result, and how long it took?

Best wishes,
Sjzwind

Hardware specifications for IMN model

Hello team @JasonForJoy,
Is there any document that lists the hardware requirements of the model, i.e. the minimum specs of a computer system needed to run the model on the original UDC (900k training dialogues) with the default training parameters (i.e. 1,000,000 epochs and batch size 128, with evaluation every 1000 steps)?

I have tried running it with a Colab Pro account with a premium A100 GPU and high RAM enabled, using only a reduced dataset (around 10,000 training dialogues), and it still took a very long time: around 15+ hours for just 4 epochs on this setting!

Am I missing something here that you could help me with to speed up the training time?

Looking forward to your help.
Regards.

A few questions about the Douban dataset

@JasonForJoy Sorry to bother you. I had the pleasure of reading your paper, and it is excellent! I have a few questions:
1. In the Douban training set, each example is annotated with only one positive and one negative response:
[screenshot]
May I ask how this positive and this negative example were selected? Is there any rule?
2. In the Douban test set test.txt:
[screenshot]
The positive and negative examples add up to 10 candidates. Is this the result of a coarse pre-filtering step? If so, what rule was used for the filtering? Also, the number of positives can be NA; what is the reason for that?
3. If there is code for the dataset filtering rules, could you send me a copy? Thank you very much! My email is [email protected]

Model hyperparameters

Hello Mr. Jia-Chen Gu,
I wanted to clear up a confusion so that I can replicate the model results as-is.

In your paper, 'Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots', the embedding dimension used on the original UDC is 200 with a dropout probability of 0.2, whereas in the 'train.sh' script at https://github.com/JasonForJoy/IMN/blob/master/Ubuntu_V1/scripts/ubuntu_train.sh the embedding_dim is 400 and the dropout probability is 0.8.

Which settings should be used to replicate the published results?

Your response shall be much appreciated.
