Creative writing with GPT-2

Quickly get started with a notebook on Google Colab.

One of the most important machine learning stories of 2019 is the progress made in applying transfer learning to massive language models.

I have been experimenting with retraining GPT-2 on authors I like and using the model as a writing partner. The process has been enlightening, and it points towards a future where humans and machines can write creatively together.

You can see examples of text generation from some of the finetuned models here.

This library wraps around the excellent Hugging Face Transformers library. Two of the scripts have been copied into this repo - run_generation and run_lm_finetuning, both of which can be found here.

How to write creatively with GPT-2

GPT-2 is not ready to write text on its own - but with a bit of human supervision you can use the text it generates to write something interesting!

GPT-2 was originally trained on WebText, a dataset of roughly 40 GB of text scraped from web pages linked on Reddit. This library can be used to generate text with the base GPT-2 model and to fine-tune the base model on text of your choosing.

The library has a number of datasets in creative-writing-with-gpt2/data. A dataset is defined as a text file called clean.txt - for example asimov/clean.txt.

$ tree -L 1 creative-writing-with-gpt2/data
creative-writing-with-gpt2/data
├── alan-watts
├── asimov
├── bible
├── harry
├── hemingway
├── mahabarta
├── meditations
├── plato
└── tolkien
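The dataset convention above (one directory per dataset, each holding a clean.txt) can be checked with a few lines of Python. This is a minimal sketch, not part of the repo: the helper name list_datasets is hypothetical, and it assumes only the layout described above.

```python
from pathlib import Path


def list_datasets(data_dir):
    """Return the names of subdirectories of data_dir that contain a clean.txt file."""
    root = Path(data_dir)
    if not root.is_dir():
        # No data directory at this path: nothing to list.
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "clean.txt").is_file()
    )
```

Calling list_datasets("data") from the repository root should return the directory names that are usable as datasets; a directory without a clean.txt is skipped.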

A number of pre-fine-tuned models are available in creative-writing-with-gpt2/models.py - you can download them to your machine by running python models.py.

Run on Colab

The recommended way to interact with this repo is through this Google Colab notebook - the free GPU is useful for fine-tuning.

Run locally

git clone https://github.com/ADGEfficiency/creative-writing-with-gpt2
cd creative-writing-with-gpt2
pip install -r requirements.txt
python models.py

To generate text with a fine-tuned model (either downloaded by running python models.py or trained yourself):

python run_generation.py \
  --model_type=gpt2 \
  --model_name_or_path="./models/tolkien" \
  --length=200

To fine-tune the base GPT-2 model on a dataset, for example harry:

python run_lm_finetuning.py \
  --output_dir="./models/harry" \
  --model_type=gpt2 \
  --model_name_or_path=gpt2 \
  --do_train \
  --train_data_file="./data/harry/clean.txt" \
  --num_train_epochs=4 \
  --overwrite_output_dir \
  --save_steps 10000
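
If you want to fine-tune on several datasets, the fine-tuning command above can be built from Python instead of being retyped. The sketch below is an illustration only: finetune_command and finetune are hypothetical helpers that do not exist in the repo, and they use only the flags shown in the command above. It assumes you run it from the repository root.

```python
import subprocess
import sys
from pathlib import Path


def finetune_command(dataset, epochs=4, save_steps=10000):
    """Build the run_lm_finetuning.py argument list for one dataset."""
    return [
        sys.executable,
        "run_lm_finetuning.py",
        f"--output_dir=./models/{dataset}",
        "--model_type=gpt2",
        "--model_name_or_path=gpt2",
        "--do_train",
        f"--train_data_file=./data/{dataset}/clean.txt",
        f"--num_train_epochs={epochs}",
        "--overwrite_output_dir",
        f"--save_steps={save_steps}",
    ]


def finetune(dataset, **kwargs):
    """Run fine-tuning for one dataset, failing early if its clean.txt is missing."""
    data_file = Path("data") / dataset / "clean.txt"
    if not data_file.is_file():
        raise FileNotFoundError(data_file)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(finetune_command(dataset, **kwargs), check=True)
```

For example, finetune("harry") would run the same command as above, and finetune("asimov", epochs=2) would train a second model into ./models/asimov.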

To generate text with the base GPT-2 model:

python run_generation.py \
  --model_type=gpt2 \
  --model_name_or_path="models/gpt2" \
  --length=200

Further reading

Allen Institute for Artificial Intelligence GPT-2 Explorer

huggingface/transformers

The Illustrated GPT-2 - Visualizing Transformer Language Models

The State of Transfer Learning in NLP
