guigen's Introduction

Syntactically Guided Text Generation

This repo is associated with the paper "Transformer-Based Neural Text Generation with Syntactic Guidance".

Requirements

Training

Data Preparation

Download training and test data from here and copy the train and test folders into the data folder.
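Before training, it can help to confirm that the folders landed where the scripts expect them. The following is a minimal sketch, assuming the layout described above (a data folder containing train and test); the helper name missing_data_dirs is hypothetical and not part of this repo.

```python
from pathlib import Path


def missing_data_dirs(data_root="data", required=("train", "test")):
    """Return the required sub-folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in required if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing under ./data:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

An empty list means both folders are present; anything else names what still needs to be copied.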

Train Syntax Expander

If your data folder is in the current directory, you can run the script run_syn_train.sh directly:

./run_syn_train.sh

Otherwise, you need to specify --ori_dir, --ref_dir and --dict_dir parameters.

The trained model is saved in the models folder as model.<date>.best.synlvl.chkpt.
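Because the checkpoint name embeds a date, the exact file name changes between runs. Here is a small sketch for locating the most recent syntax-expander checkpoint, assuming only the naming pattern stated above. The function names are hypothetical, and ordering by file modification time is an assumption, since the format of <date> is not specified here.

```python
from pathlib import Path


def find_syn_checkpoints(models_dir="models"):
    """List files named model.<date>.best.synlvl.chkpt, oldest first by mtime."""
    pattern = "model.*.best.synlvl.chkpt"
    return sorted(Path(models_dir).glob(pattern), key=lambda p: p.stat().st_mtime)


def latest_syn_checkpoint(models_dir="models"):
    """Return the most recently modified matching checkpoint, or None."""
    checkpoints = find_syn_checkpoints(models_dir)
    return checkpoints[-1] if checkpoints else None


if __name__ == "__main__":
    print(latest_syn_checkpoint())
```

The returned path can then be copied into whichever script needs the model location.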

Train Text Generator

Again, if the data folder is in your current directory, you can run the script run_txt_gen_train.sh directly. Otherwise, you need to specify the --ori_dir, --ref_dir and --dict_dir parameters.

Inference

Generate Text with Ground Truth Target Parse

You need to edit the script run_txt_generate.sh before using it.

First, replace the <date> part of TXT_MODEL_PATH with the actual value. Then, if the data folder is not in the current directory, also specify --bpe_model_path, --test_data_path and --dict_path.

Note that if you trained the text generator on a single GPU or on CPU only, you have to delete lines 49 and 51 of the file ./TextGen/Generator.py.

The generated text is saved in the folder ./generations.

Expand Template Parse and Generate Text with the Expanded Parse

The overall operation is the same as above. The script to run is run_txt_gen_from_tmpl.sh.

guigen's People

Contributors

yinghao-li

guigen's Issues

Scripts to generate .PT data files

Hi,

I happened to read this paper a couple of weeks ago and found this idea really fascinating! I started trying out the code and I was wondering if you could please share the code to generate the .pt data (the URL mentioned in the README). For example, did you read separate files for text strings and syntax strings?

Thanks in advance!

Generate on a trained model

Hi, thanks for the continued support of this project! I extended the code a bit to use a pre-trained embedding for the text pass, and I have trained a model successfully. However, when generating the paraphrases, I hit an error:

Traceback (most recent call last):
  File "txt_generate.py", line 137, in <module>
    main()
  File "txt_generate.py", line 122, in main
    path_mask=path_mask_batch,
  File "/home/ubuntu/research/ruisi/GuiGen/TextGen/Generator.py", line 146, in inference
    dec_beam.forward(txt_prob=txt_prob)
  File "/home/ubuntu/research/ruisi/GuiGen/TextGen/Beam.py", line 73, in forward
    sorted_prev_word_seq = self._txt_seq[prev_beam_indices]
IndexError: tensors used as indices must be long, byte or bool tensors

I have modified run_txt_generate.sh to use the proper model name, and made lines 49 and 51 conditional on CUDA. Could you please help me figure out what went wrong?
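One frequent cause of this IndexError is offered here as an assumption, since the contents of Beam.py are not shown: a flat top-k index gets split into a beam index and a word index with true division. In recent PyTorch versions, / on integer tensors returns floats, and floats cannot be used as indices. The pure-Python sketch below illustrates the arithmetic only; split_flat_index, beams and vocab_size are hypothetical names, not the repo's code.

```python
def split_flat_index(flat_index, vocab_size):
    """Map an index into a flattened (beam, vocab) score table back to (beam, word)."""
    beam_index = flat_index // vocab_size  # floor division keeps an integer
    word_index = flat_index % vocab_size
    return beam_index, word_index


beams = ["seq-0", "seq-1", "seq-2"]
beam_index, word_index = split_flat_index(flat_index=7, vocab_size=5)
print(beams[beam_index], word_index)  # 7 = 1 * 5 + 2, so this prints: seq-1 2

# True division yields a float, which is not a valid index.
try:
    beams[7 / 5]
except TypeError as err:
    print("float index rejected:", err)
```

If this is indeed the cause, the tensor-level counterpart would be to compute the beam indices with torch.div(..., rounding_mode="floor") or to cast them with .long() before indexing.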
