zichaow / qg-net

code for QG-Net: A Data-Driven Question Generation Model for Educational Content

License: MIT License

Shell 2.60% Python 97.40%

qg-net's Issues

{RuntimeError} CuDNN error: CUDNN_STATUS_SUCCESS (dependency version error)

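Dependencies: ubuntu16.04 + cuda9.0 + cudnn7.1.2 + python3.5 + pytorch0.4.1 + torchtext0.1.1 + torchvision0.2.1
I suspect my CUDA and cuDNN versions differ from the ones the code was run with. Could you tell me which dependency versions are required?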

How to preprocess my own corpus?

Thanks for your great work!
I found that your model is trained on the SQuAD dataset.
I was wondering whether I could train a model with my own dataset.
Thanks!

python3: symbol lookup error: /usr/local/python3/lib/python3.6/site-packages/torch/lib/libtorch_python.so: undefined symbol: PySlice_Unpack

My environment is Python 3.6. Running ./qg_reproduce_LS.sh fails with the following error:
[root@VM_0_8_centos test]# ./qg_reproduce_LS.sh
Loading model parameters.
0%| | 0/11 [00:00<?, ?it/s]
/usr/local/python3/lib/python3.6/site-packages/torchtext/data/field.py:197: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  return Variable(arr, volatile=not train), lengths
/usr/local/python3/lib/python3.6/site-packages/torchtext/data/field.py:198: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  return Variable(arr, volatile=not train)
python3: symbol lookup error: /usr/local/python3/lib/python3.6/site-packages/torch/lib/libtorch_python.so: undefined symbol: PySlice_Unpack
[root@VM_0_8_centos test]#
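
One possible cause, offered as an assumption rather than a confirmed diagnosis: PySlice_Unpack was added to the CPython C API in Python 3.6.1, so a PyTorch binary built against a later 3.6.x release can fail with this lookup error on a 3.6.0 interpreter. A minimal check of the running interpreter might look like this (the helper name `likely_missing_pyslice_unpack` is hypothetical):

```python
import sys


def likely_missing_pyslice_unpack(version_info=sys.version_info):
    """Return True if this interpreter is CPython 3.6.0.

    Assumption: among the 3.6.x releases, only 3.6.0 lacks PySlice_Unpack.
    """
    return tuple(version_info[:3]) == (3, 6, 0)


print("Python", sys.version.split()[0])
print("PySlice_Unpack likely missing:", likely_missing_pyslice_unpack())
```

If the check reports True, upgrading to a later 3.6.x patch release is worth trying before reinstalling PyTorch.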

Connection refused

Hi, when downloading prepro_squad, I ran into the following problem. Could you help me? Thanks.

--2019-01-29 11:42:32-- https://rice.box.com/shared/static/6haddoiep15fmdqmtdp3ccf44o1m6e4z.gz
Resolving rice.box.com... 107.152.24.197
Connecting to rice.box.com|107.152.24.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://rice.app.box.com/shared/static/6haddoiep15fmdqmtdp3ccf44o1m6e4z.gz [following]
--2019-01-29 11:42:34-- https://rice.app.box.com/shared/static/6haddoiep15fmdqmtdp3ccf44o1m6e4z.gz
Resolving rice.app.box.com... 8.7.198.45
Connecting to rice.app.box.com|8.7.198.45|:443... failed: Connection refused.
preprocessed data download completed.
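
Note that the log ends with "preprocessed data download completed." even though the connection was refused, so the script's final message alone doesn't prove the archive arrived. A small sketch for checking a downloaded archive before using it (the function name and the example file path are assumptions, not part of the repository):

```python
import gzip
import os
import zlib


def looks_like_valid_gzip(path):
    """Return True if path exists, is non-empty, and decompresses cleanly."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    try:
        with gzip.open(path, "rb") as f:
            # Read in 1 MiB chunks so large archives are not loaded at once.
            while f.read(1 << 20):
                pass
    except (OSError, EOFError, zlib.error):
        # Not a gzip file, truncated download, or corrupted data.
        return False
    return True


# Hypothetical local file name; adjust to wherever the script saved the archive.
print(looks_like_valid_gzip("prepro_squad.gz"))
```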

__init__() got an unexpected keyword argument 'tensor_type'

Hello. Thanks for sharing code.

I followed the README, but this error occurred:

File "/opt/share-vol/ochiroo/QG-Net/OpenNMT-py/onmt/io/IO.py", line 101, in get_fields
postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

How can I fix this?
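
This kind of error usually points to a library version mismatch. To my understanding, later torchtext releases renamed the Field keyword `tensor_type` to `dtype`, though that isn't confirmed in this thread. The sketch below shows one way to detect which keyword a constructor accepts; `OldField` and `NewField` are hypothetical stand-ins, not the real torchtext classes:

```python
import inspect


def dtype_keyword(field_cls):
    """Return the keyword that field_cls.__init__ accepts for the tensor type."""
    params = inspect.signature(field_cls.__init__).parameters
    for name in ("tensor_type", "dtype"):
        if name in params:
            return name
    raise TypeError("constructor accepts neither 'tensor_type' nor 'dtype'")


# Hypothetical stand-ins for an older and a newer Field class.
class OldField:
    def __init__(self, sequential=True, tensor_type=None):
        self.tensor_type = tensor_type


class NewField:
    def __init__(self, sequential=True, dtype=None):
        self.dtype = dtype


print(dtype_keyword(OldField))
print(dtype_keyword(NewField))
```

In practice, installing the torchtext version the repository was written against is likely simpler than patching call sites, since the expected values for the two keywords may differ as well.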

RuntimeError: view size is not compatible with input tensor's size and stride

Hi, this is really great work!
However, I got the following error message when I ran train.sh:

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

How could I fix it?
Thanks a lot!
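
For context, the error refers to memory layout: a view reinterprets existing storage without copying, which is always possible for a row-major (contiguous) tensor but can fail after operations such as a transpose. The pure-Python sketch below only illustrates the stride bookkeeping. It does not use PyTorch, and the helper names are hypothetical:

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a tensor of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_contiguous(shape, strides):
    """True if strides match row-major layout (size-1 dims are ignored)."""
    expected = contiguous_strides(shape)
    return all(d == 1 or s == e for d, s, e in zip(shape, strides, expected))


# A 2x3 row-major tensor has strides (3, 1).
shape, strides = (2, 3), contiguous_strides((2, 3))
# Transposing swaps shape and strides without moving any data.
t_shape, t_strides = shape[::-1], strides[::-1]

print(is_contiguous(shape, strides))
print(is_contiguous(t_shape, t_strides))
```

In PyTorch itself, the usual remedies are the one the message suggests, replacing `.view(...)` with `.reshape(...)`, or calling `.contiguous()` before `.view(...)`.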

May I ask why the source vocab and target vocab are separated?

It seems a bit late but I just read your paper and tried to run the demo. Amazing work, thank you for sharing the repo.

I took a look at the model structure and found that the encoder and decoder have their own embedding layers of different sizes, meaning the source and target vocabularies are separate. That makes sense as the default setting in OpenNMT, since the source and target are in two different languages in translation tasks. Here, however, we are dealing with the QG task, where both are English sentences, similar to the summarization task.

In Abi's original implementation we can see that the embeddings are shared by encoder and decoder (https://github.com/abisee/pointer-generator/blob/master/model.py @ line 209).

I'm wondering how exactly the copy mechanism works here. If, as in the paper, the generation vocabulary for the i-th question is the union of the question vocabulary and the vocabulary of the i-th input context, does that mean the purpose of the pointer network is not to deal with OOVs? In the "extended generation vocab" for each question, you are adding words that are in the encoder's vocab but not in the decoder's vocab, so they are not really OOVs for the model as a whole.

Second, in this setting, whenever there is a copy, that means the word is not in the question vocabulary, so there will be a mismatch between the reference question and generated question. To me, it seems that the loss will increase every time the copy mechanism works.

I'm not sure if I have a misunderstanding about your work. Please correct me if you find any. I will appreciate your reply.
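
For reference, in pointer-generator style models such as the implementation linked above, the reference question is also encoded over the extended vocabulary, so a correct copy matches the target instead of being penalized. Whether QG-Net's OpenNMT-py code does exactly this isn't confirmed here. The sketch below only illustrates the general idea, and every name in it is hypothetical:

```python
UNK = 0


def build_extended_ids(src_tokens, tgt_tokens, tgt_vocab):
    """Encode source and target over tgt_vocab extended with source-only words."""
    oov_ids = {}  # source word missing from tgt_vocab -> temporary per-example id
    src_ids = []
    for tok in src_tokens:
        if tok in tgt_vocab:
            src_ids.append(tgt_vocab[tok])
        else:
            if tok not in oov_ids:
                oov_ids[tok] = len(tgt_vocab) + len(oov_ids)
            src_ids.append(oov_ids[tok])
    # Target words missing from tgt_vocab reuse the temporary id if the word
    # appears in the source; only words found in neither place become UNK.
    tgt_ids = [tgt_vocab.get(tok, oov_ids.get(tok, UNK)) for tok in tgt_tokens]
    return src_ids, tgt_ids, oov_ids


tgt_vocab = {"<unk>": UNK, "what": 1, "is": 2, "the": 3, "?": 4}
src = ["the", "mitochondrion", "is", "an", "organelle"]
tgt = ["what", "is", "the", "mitochondrion", "?"]

src_ids, tgt_ids, oov_ids = build_extended_ids(src, tgt, tgt_vocab)
print(src_ids)  # [3, 5, 2, 6, 7]
print(tgt_ids)  # [1, 2, 3, 5, 4]
print(oov_ids)  # {'mitochondrion': 5, 'an': 6, 'organelle': 7}
```

Here "mitochondrion" gets id 5 in both the source and the reference, so copying it from the source agrees with the reference question rather than adding to the loss.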

Could you share the baseline code?

Hi, I noticed that you shared the checkpoints of two baselines.
Could you share the baseline code as well? I'd like to reproduce the baseline results with my own dataset. Thanks!

missing download_baselines.sh

Hi,
The script download_baselines.sh mentioned in the documentation is missing. We are trying to test this. Can you please upload the missing files?
Thanks, -Muthu

Add licence

Hi, please add a licence to your work. It would make it much more attractive to other contributors. This is very exciting work, and I would like to contribute to it myself.

Access to Trained Modules and Data files

Hi, I found your paper very interesting. While trying to run this code, I realized that the preprocessed dataset and trained models cannot be downloaded. I assume they are no longer hosted, but can this be verified? Also, would it be possible to re-upload them? Thank you.

How to preprocess input text?

I can run the test demo and get good results. But how can I correctly preprocess my own documents to generate questions? I found the file preproc_squad.py, but that code is used to process the training/test data, and it's a little hard for me to extract the preprocessing code from it. Can you provide the input document preprocessing scripts, or simply describe the procedure?