generative-text-winson's Introduction

Project 1 Generative Text

Winson Luk

Abstract

Every startup claims to be disrupting an industry or changing the world. Most startup ideas are destined to fail, but some truly change the world. By training on thousands of startup taglines, articles, and interviews, this project aims to generate a lot of bad startup ideas, and a few good ones.

Most startup ideas can be summarized in just one paragraph. The tagline describes the overarching concept (e.g., "Uber is finding you better ways to move, work, and succeed"), and the next few sentences can provide a more detailed description of the product, as well as context on the startup's history, people, and industry.

The tagline can be created by finetuning GPT-2 on a dataset of startup taglines (https://github.com/winsonluk/gpt_pitches), and the subsequent sentences can be generated by feeding this tagline as a prefix into other GPT-2 models finetuned on startup descriptions and company analyses (https://github.com/winsonluk/gpt_descriptions and https://github.com/winsonluk/gpt_summaries).
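The chaining step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: each model is represented by a plain callable that takes a prefix and returns text, and the names `generate_startup`, `continuation`, and `placeholder_model` are hypothetical. In practice each callable would wrap a finetuned GPT-2 model (gpt-2-simple's `generate` accepts a `prefix` argument for this purpose). The sketch assumes the generator echoes the prefix at the start of its output.

```python
from typing import Callable, Dict

# A text generator maps a prefix to generated text. Here it is any callable;
# in the real project it would wrap a finetuned GPT-2 model (assumption).
TextGenerator = Callable[[str], str]


def continuation(generated: str, prefix: str) -> str:
    """Drop the echoed prefix so only the newly generated text remains."""
    if prefix and generated.startswith(prefix):
        return generated[len(prefix):].strip()
    return generated.strip()


def generate_startup(
    tagline_model: TextGenerator,
    description_model: TextGenerator,
    summary_model: TextGenerator,
) -> Dict[str, str]:
    """Generate a tagline, then feed it as a prefix to the other two models."""
    # Stage 1: sample a tagline from the tagline model with an empty prefix.
    tagline = tagline_model("").strip()
    # Stage 2: reuse the tagline as the prefix for the longer-form models.
    description = continuation(description_model(tagline), tagline)
    analysis = continuation(summary_model(tagline), tagline)
    return {"tagline": tagline, "description": description, "analysis": analysis}


def placeholder_model(text: str) -> TextGenerator:
    """Stand-in for a finetuned model: echoes the prefix, then appends fixed text."""
    return lambda prefix: f"{prefix} {text}".strip()


if __name__ == "__main__":
    idea = generate_startup(
        placeholder_model("TAGLINE"),
        placeholder_model("DESCRIPTION"),
        placeholder_model("ANALYSIS"),
    )
    for part, text in idea.items():
        print(f"{part}: {text}")
```

Passing the models in as callables keeps the chaining logic independent of how each model is loaded, which matters here because the three finetuned checkpoints are separate.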

The ideas generated have been fairly realistic (most are bad, some are good), so there are plans to incorporate these results into a faux startup website similar to https://tiffzhang.com/startup, with a few million permutations of ideas. The value of these ideas depends solely on the reader's interpretation (see reader-response theory), but hopefully some of them are cohesive enough to serve as inspiration.

Model/Data

Code

Results

Technical Notes

  • The multi-gpu fork of gpt-2-simple needs to be installed to train with the 774M model.
  • I used 4 x Tesla V100 GPUs and 16 GB of RAM on Vast.ai to train the models. Training will fail with a single GPU or with less than 16 GB of RAM. After training, generation can be performed with a single GPU, though 16 GB of RAM is still necessary.
  • The startup tagline and description models are finetuned to a loss of around 0.1, while the larger TechCrunch model is finetuned to a loss of 1.8.
  • I sampled all models with temperatures from 0.2 to 2.0 and top-p values from 0.1 to 1.0 (higher values translate to more "creativity" in the text) to find the optimal parameters for realistic text generation.
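
To make the effect of temperature and top-p concrete, here is a small standalone sketch of the standard definitions applied to a toy next-token distribution. It is not the sampling code inside gpt-2-simple, and the logits, the function names `nucleus_distribution` and `sweep`, and the grid values are illustrative assumptions. It shows that raising either parameter leaves more candidate tokens in play, which is where the extra "creativity" comes from.

```python
import math
from typing import Dict, List, Sequence, Tuple


def nucleus_distribution(
    logits: Sequence[float], temperature: float = 1.0, top_p: float = 1.0
) -> Dict[int, float]:
    """Return {token_index: probability} after temperature scaling and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: divide the logits before applying the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the smallest set of most likely tokens whose
    # cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


def sweep(
    logits: Sequence[float], temperatures: Sequence[float], top_ps: Sequence[float]
) -> Dict[Tuple[float, float], int]:
    """Count how many candidate tokens survive for each (temperature, top_p) pair."""
    return {
        (t, p): len(nucleus_distribution(logits, t, p))
        for t in temperatures
        for p in top_ps
    }


# Hypothetical next-token logits for a five-token toy vocabulary.
toy_logits = [4.0, 3.0, 2.0, 1.0, 0.0]

if __name__ == "__main__":
    grid = sweep(toy_logits, [0.2, 1.0, 2.0], [0.1, 0.5, 0.9])
    for (t, p), n in grid.items():
        print(f"temperature={t} top_p={p} -> {n} candidate tokens")
```

At low settings only the single most likely token survives, so output is repetitive; at high settings most of the vocabulary remains eligible, so output is more varied but less coherent.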

Examples

  • 1
  • 2
  • 4
  • 5
  • 7
  • 8
  • 9
  • 10
  • 11

Bloopers

Lowering unemployment

Strategic arms sales

Internet of things

Workers of the world, unite!

10,000 hours

Reference

References to any papers, techniques, repositories you used:
