qa-deep-learning's People

Contributors

white127

qa-deep-learning's Issues

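Reproduced results are low

Hi, I also work on NLP.
I tried to reproduce this myself and the results were poor. My architecture: q and a both go through an LSTM (parameters shared between the two sides), then max pooling to get a vector, then cosine similarity followed by a triplet loss. I only reached 0.5, and training is very slow. I sample 100 negative answers per question. Is your model fast? Mine takes about a day to converge, and I took all the hyperparameters from other people's papers...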
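
For readers following along, the loss described above can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the function names and the margin value are assumptions, and the vectors stand in for the max-pooled LSTM outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_hinge_loss(q, a_pos, a_neg, margin=0.05):
    """Hinge loss on cosine scores: zero once the positive answer
    outscores the negative one by at least `margin`.
    The default margin is a hypothetical value, not taken from the repo."""
    return max(0.0, margin - cosine(q, a_pos) + cosine(q, a_neg))

# Toy vectors: the question matches the positive answer exactly.
loss_easy = triplet_hinge_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
loss_hard = triplet_hinge_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

With 100 negatives per question, every negative needs its own forward pass, which may account for part of the slowdown mentioned above.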

Cannot understand the build_vocab() function in insurance_qa_data_helper.py

Line 14:

    def build_vocab():
        code = int(0)
        vocab = {}
        vocab['UNKNOWN'] = code
        code += 1
        for line in open('/export/jw/cnn/insuranceQA/train'):
            items = line.strip().split(' ')
            for i in range(2, 3):  # this is the line in question: why not range(2, 4)?
                words = items[i].split('_')
                for word in words:
                    if not word in vocab:
                        vocab[word] = code
                        code += 1
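
As a small illustration of what the question is about: `range(2, 3)` visits only index 2, while `range(2, 4)` visits indices 2 and 3. The sample line below is made up, and its column layout is an assumption, not the actual InsuranceQA format.

```python
# Hypothetical training line; the real column layout may differ.
line = "1 qid:0 what_is_insurance a_contract_that_covers_risk"
items = line.strip().split(' ')

# range(2, 3) yields only i == 2, so a single column feeds the vocabulary.
fields_2_3 = [items[i] for i in range(2, 3)]

# range(2, 4) yields i == 2 and i == 3, so two columns would be used.
fields_2_4 = [items[i] for i in range(2, 4)]
```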

About converting the InsuranceQA training corpus and negative sampling

The training data is fetched in the code via: utils.gen_train_batch_qpn(train_data, FLAGS.batch_size)
But inside that function:

    def gen_train_batch_qpn(_data, batch_size):
        psample = random.sample(_data, batch_size)
        nsample = random.sample(_data, batch_size)
        q = [s1 for s1, s2 in psample]
        qp = [s2 for s1, s2 in psample]
        qn = [s2 for s1, s2 in nsample]
        return np.array(q), np.array(qp), np.array(qn)

psample and nsample are obtained in the same way??
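
In the function quoted above, the negatives are answers drawn from other randomly sampled (question, answer) pairs, so a negative can occasionally coincide with the question's own positive answer. One possible variant that redraws in that case is sketched below. The function name and the `rng` parameter are hypothetical, and the sketch assumes the data contains at least two distinct answers. It does not rule out a negative that happens to be another correct answer to the same question.

```python
import random

def gen_train_batch_qpn_distinct(data, batch_size, rng=random):
    """Sample (question, positive, negative) triples from (q, a) pairs,
    redrawing any negative that equals the question's own positive."""
    psample = rng.sample(data, batch_size)
    answers = [a for _, a in data]
    q, qp, qn = [], [], []
    for question, pos in psample:
        neg = rng.choice(answers)
        while neg == pos:  # assumes at least two distinct answers exist
            neg = rng.choice(answers)
        q.append(question)
        qp.append(pos)
        qn.append(neg)
    return q, qp, qn

# Toy data: ten question/answer pairs with distinct answers.
toy = [("q%d" % i, "a%d" % i) for i in range(10)]
q, qp, qn = gen_train_batch_qpn_distinct(toy, 5, random.Random(0))
```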

Model problem

Hello, after training the model I loaded the saved model and tried to print the parameters, but I am told they do not exist:
ValueError: Fetch argument 'W:0' cannot be interpreted as a Tensor. ("The name 'W:0' refers to a Tensor which does not exist. The operation, 'W', does not exist in the graph.")

When I use TensorBoard for visualization it reports:
No dashboards are active for the current data set.
Probable causes:

You haven’t written any data to your event files.
TensorBoard can’t find your event files.
Could you take a look?

Paper

Hello, which paper is this based on? Thanks.

Share full or more data

Dear Authors,

I was happy to see your results on the insurance data set and attempted to reproduce them on my side, but I could not replicate them on my test data. The reason is that you applied stop-word removal, text normalization and lemmatization to the text, so my test data does not match yours. If possible, could you please share the full test data, or 500 to 1000 test queries in the same format as the 20 test queries you have provided?

--Veera.

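How to predict

How do you run prediction in actual use? If the candidate set contains many answers, the computation will be slow, right?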
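
One common way to keep prediction cheap is to encode and normalize all candidate answers once, offline, so that each query costs only one encoder pass plus dot products. The sketch below assumes the answer and question vectors have already been produced by the trained model; the function names are hypothetical and this is not the repository's prediction code.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rank_candidates(q_vec, cand_vecs_normalized, top_k=3):
    """Return the indices of the top_k candidates by cosine similarity.
    cand_vecs_normalized is computed once and reused for every query."""
    q = normalize(q_vec)
    scores = [sum(a * b for a, b in zip(q, c)) for c in cand_vecs_normalized]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]

# Toy vectors standing in for encoder outputs.
candidates = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
best = rank_candidates([2.0, 0.1], candidates, top_k=2)
```

For very large candidate sets, the same precomputed vectors could be handed to an approximate nearest-neighbour index instead of the exhaustive scan shown here.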

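The TensorFlow and Theano CNN code both easily exceed 80% accuracy...

I used the author's code unmodified. The data comes from https://github.com/codekansas/insurance_qa_python, converted to the same format the author uses. After getting it running, I found that it trains on train and validates on test1.

With both the author's TensorFlow code and the Theano code, top-1 accuracy easily reaches 0.86, with a learning rate of 0.1 and roughly 6000 epochs. Training takes under half an hour on a K80 in both cases.

As far as I know, the state of the art on this task does not exceed 0.7, so the author's code beats it with ease. I checked the code carefully and found no obvious mistakes. Has anyone else gotten the same result?

The code I ran is here (I only cleaned it up a little and added the train and test data; the parameters and architecture are exactly the same as the original author's): https://github.com/pcgreat/insuranceQA-cnn-lstm/tree/master/cnn/tensorflow
This should make it easy for everyone to reproduce the result.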
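
When comparing numbers like these, it helps to pin down how top-1 accuracy is computed. The sketch below assumes each test question comes with a pool of scored candidate answers and a set of indices marking the correct ones. That data layout is an assumption for illustration, not the repository's evaluation code.

```python
def top1_accuracy(examples):
    """examples: list of (scores, correct_indices) pairs, one per question.
    A question counts as a hit if its highest-scoring candidate is correct."""
    hits = 0
    for scores, correct in examples:
        best = max(range(len(scores)), key=lambda i: scores[i])
        if best in correct:
            hits += 1
    return hits / len(examples)

# Toy evaluation set: two of the three questions are answered correctly.
toy_eval = [
    ([0.9, 0.1], {0}),
    ([0.2, 0.8], {0}),
    ([0.3, 0.7, 0.1], {1, 2}),
]
accuracy = top1_accuracy(toy_eval)
```

The size and composition of the candidate pool strongly affect this metric, so results are only comparable when the pools match.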

TypeError: ('An update must have the same type as the original shared variable

python insqa_lstm.py
insqa_lstm.py:279: UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
  pooled_out = pool.pool_2d(input=conv_out, ds=(sequence_len - filter_size + 1, 1), ignore_border=True, mode='max')
Traceback (most recent call last):
  File "insqa_lstm.py", line 428, in <module>
    train()
  File "insqa_lstm.py", line 399, in train
    x1: p1, x2: p2, x3: p3, m1: q1, m2: q2, m3: q3
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/pfunc.py", line 449, in pfunc
    no_default_updates=no_default_updates)
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/pfunc.py", line 208, in rebuild_collect_shared
    raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Where can I see the sample data?

I converted original idx_xx format to real-word format (see ./insuranceQA/train ./insuranceQA/test1.sample)

Where can I see this sample data, or what does its format look like?

Question about the CNN network structure

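The input to the second conv2d layer in the network should be outputs_1[0], shouldn't it, rather than input_x1 every time? This is the theano_cnn version of the code.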

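Hello, a question about batch_size

Currently the data for both the CNN and the RNN is fed in one batch_size at a time. The problem is that the last batch of the data set may contain fewer than batch_size examples. What should be done then? In MatConvNet the last batch is allowed to be smaller than batch_size. From what I can see, the tutorial simply skips the last batch. But what about test time: is it skipped there too? The example given in the tutorial is not very good. Perhaps I have not read enough TensorFlow code, especially for LSTMs, where the batch size has to be fixed in advance: state_init_R = tf.tile(init_R, [batch_size, 1]) requires batch_size to be specified. I asked around and was told that Theano is more flexible in this respect. I see you have used both Theano and TensorFlow, so you probably understand this well. This has been bothering me for a while and I have not found a good solution. What is your view? Thanks!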
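
One framework-independent workaround for a graph that requires a fixed batch size is to pad the final batch up to batch_size and remember how many entries are real, so the padded outputs can be discarded afterwards. The following is a minimal sketch assuming the examples are held in a Python list; the function name is hypothetical.

```python
def padded_batches(items, batch_size):
    """Yield (batch, n_real) pairs where every batch has exactly batch_size
    entries. A short final batch is padded by repeating its last item;
    n_real says how many leading entries are genuine."""
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        n_real = len(batch)
        batch = batch + [batch[-1]] * (batch_size - n_real)
        yield batch, n_real

# Five examples with batch_size 2: the last batch holds one real item.
batches = list(padded_batches([0, 1, 2, 3, 4], 2))
```

At test time, only the first n_real outputs of each batch would be kept, so no example is skipped and none is counted twice.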

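Sampling from train

When training the model, the answers in the train data are all correct answers, right? They are all qp, with no qn. Am I misunderstanding something?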

new model

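The model uses only a single convolution layer, with filters of size (1, 2, 3, 5) × embedding_size. What if one built a model in the usual image-processing style instead, where the filter is a small sliding window (e.g. 5×5) and conv + max_pool is applied several times? Would such a model have any problems?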
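
To make the shapes in the single-layer design concrete: a filter that spans the full embedding width collapses that dimension to 1, and with stride 1 and no padding the feature map height is sequence_len - filter_size + 1. The sketch below only does this arithmetic. The sequence length and filter count are made-up values, not the repository's settings.

```python
def conv_feature_map_height(sequence_len, filter_size):
    """Output height of a stride-1, unpadded convolution over the
    token axis with a filter covering filter_size tokens."""
    return sequence_len - filter_size + 1

sequence_len = 100   # hypothetical sentence length
num_filters = 500    # hypothetical number of filters per size
filter_sizes = (1, 2, 3, 5)

heights = {k: conv_feature_map_height(sequence_len, k) for k in filter_sizes}

# Max pooling over the whole height leaves one value per filter, so the
# sentence vector concatenates num_filters values for each filter size.
sentence_vector_len = num_filters * len(filter_sizes)
```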

TypeError: __init__() got an unexpected keyword argument 'input_shape'

rzai@rzai00:/prj/insuranceQA-cnn-lstm/lstm_cnn/theano$ python insqa_lstm.py
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
Traceback (most recent call last):
  File "insqa_lstm.py", line 433, in <module>
    train()
  File "insqa_lstm.py", line 387, in train
    num_filters=num_filters)
  File "insqa_lstm.py", line 259, in __init__
    cnn1 = self._cnn_net(tparams, cnn_input1, batch_size, sequence_len, num_filters, filter_sizes, proj_size)
  File "insqa_lstm.py", line 283, in _cnn_net
    conv_out = conv2d(input=cnn_input, filters=W, filter_shape=filter_shape, input_shape=image_shape)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 149, in conv2d
    imshp=imshp, kshp=kshp, nkern=nkern, bsize=bsize, **kargs)
TypeError: __init__() got an unexpected keyword argument 'input_shape'
rzai@rzai00:/prj/insuranceQA-cnn-lstm/lstm_cnn/theano$
