Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations
- Download the original Spider dataset
- Generate the text-to-clause dataset using SQL2NL/SQL2NL.py:
  - Set instruction = 'trainingData'.
  - To design your own explanation template, edit the method "parseSQL()".
  - Run python SQL2NL.py; the dataset will be written to "dataset/structured/spider/train_spider.json".
- Alternatively, you can download our raw text-to-clause dataset. Put it in the same directory as the original Spider dataset, and make sure that directory includes all the databases (for more information, see https://github.com/taoyds/spider).
- Paraphrase the text-to-clause dataset (optional)
  - You can paraphrase the dataset with QuillBot using our automated script, which is based on PyAutoGUI.
  - Please check the script and all of its screenshots. The screenshots are used to position the cursor during the automation. Because of subtle resolution, theme, or version differences, they may not be recognized on your computer (even where a human could identify them). In that case, take your own screenshots on your computer and replace ours manually.
- Our text-to-clause model is based on SmBoP, so you can follow their environment and settings exactly.
  - You can directly download and reuse our checkpoint (HuggingFace) as well as our configuration file.
  - Please replace the original configuration file with ours!
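
Before training, it can help to confirm that the generation step above produced a readable file. The following is a minimal sketch, not part of this repository: the helper name `summarize_dataset` is hypothetical, and it assumes the generated file is a JSON list of objects (as the original Spider `train_spider.json` is). It makes no assumption about which fields each entry contains and only reports what it finds.

```python
import json
from pathlib import Path


def summarize_dataset(path):
    """Return (number of entries, sorted field names of the first entry).

    Hypothetical helper; assumes the file holds a JSON list at the top level.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(
            f"expected a JSON list at the top level, got {type(data).__name__}"
        )
    # Report the fields of the first entry only if it is a JSON object.
    if data and isinstance(data[0], dict):
        first_keys = sorted(data[0].keys())
    else:
        first_keys = []
    return len(data), first_keys


if __name__ == "__main__":
    # Output path as stated in the steps above; adjust if your layout differs.
    dataset_path = Path("dataset/structured/spider/train_spider.json")
    if dataset_path.exists():
        count, keys = summarize_dataset(dataset_path)
        print(f"{count} entries; fields of first entry: {keys}")
    else:
        print(f"{dataset_path} not found; run SQL2NL.py first")
```

If the script reports that the file is missing, or raises on the top-level type, re-run the generation step before moving on to the model.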
To help you understand the project logic, we have also packaged most of the project folder. You can download it directly to check the configuration.
This repository is currently being updated, and more details will be provided in the future. If you have any questions, please feel free to email [email protected]
Thanks!