sketchdreamer's Introduction

SketchDreamer: Interactive Text-Augmented
Creative Sketch Ideation

PyTorch Conference Paper

Artificial Intelligence Generated Content (AIGC) has shown remarkable progress in generating realistic images. However, in this paper, we take a step "backward" and address AIGC for the most rudimentary visual modality of human sketches. Our focus is on the creative nature of sketches, and we argue that creative sketching should take the form of an interactive process. We further enable text to drive the sketch ideation process, allowing creativity to be freely defined, while simultaneously tackling the challenge of "I can't sketch". We present a method to generate controlled sketches using a text-conditioned diffusion model trained on pixel representations of images. Our proposed approach, referred to as SketchDreamer, integrates a differentiable rasteriser of Bezier curves that optimises an initial input to distil abstract semantic knowledge from a pretrained diffusion model. We utilise Score Distillation Sampling to learn a sketch that aligns with a given caption, which importantly enables both text and sketch to interact with the ideation process. Our objective is to empower non-professional users to create sketches and, through a series of optimisation processes, transform a narrative into a storyboard by expanding the text prompt while making minor adjustments to the sketch input. Through this work, we hope to reshape the way we create visual content, democratise the creative process, and inspire further research in enhancing human creativity in AIGC.

[Figure: overview]

Citation

@inproceedings{qu2023sketchdreamer,
  title={SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation},
  author={Qu, Zhiyu and Xiang, Tao and Song, Yi-Zhe},
  booktitle={BMVC},
  year={2023}
}

Instructions

Dependencies

This repository is based on CLIPasso and Stable-Dreamfusion. We would like to thank the authors of these works for publicly releasing their code. Please follow the CLIPasso dependency instructions to install DiffVG. Once CLIPasso runs successfully, install the following libraries.

pip install --upgrade diffusers accelerate transformers

Script

sh run.sh

Parameters

The optimisation process is sensitive to the initialisation of the Bezier curves. We support the following methods to set the positions of strokes.

Bounding box: follows the format "x1,y1,x2,y2". All parameters should be in the range 0 to 1.

# a square area with width and height of (0.4 * 512)
sh run.sh --bbox "0.3,0.3,0.7,0.7"
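
As a rough illustration of how the normalised box maps onto the canvas, the sketch below converts a --bbox string to pixel coordinates. The helper name bbox_to_pixels is hypothetical and the 512-pixel canvas size is an assumption taken from the comment above; this is not code from the repository.

```python
def bbox_to_pixels(bbox, canvas_size=512):
    """Convert a normalised "x1,y1,x2,y2" string to pixel coordinates.

    Hypothetical helper for illustration only; the canvas size of 512
    is an assumption based on the comment in the example above.
    """
    values = [float(v) for v in bbox.split(",")]
    if len(values) != 4:
        raise ValueError("bbox needs exactly four values: x1,y1,x2,y2")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("all bbox values must lie in the range 0 to 1")
    # Scale each normalised coordinate by the canvas side length.
    return tuple(v * canvas_size for v in values)


x1, y1, x2, y2 = bbox_to_pixels("0.3,0.3,0.7,0.7")
print(x2 - x1, y2 - y1)  # width and height of the box in pixels
```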

Coordinates: follows the format "x1,y1,x2,y2,...,xN,yN". All parameters should be in the range 0 to 1.

# 4 curves with the starting points located in 4 corners
sh run.sh --num_strokes 4 --init_point "0,0,0,1,1,0,1,1"
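
The coordinate string can be read as a flat list of (x, y) pairs, one starting point per curve. The sketch below shows one way to parse and check it; the helper name parse_init_points is hypothetical, and the requirement that the number of points equals num_strokes is an assumption inferred from the example above, not confirmed behaviour of run.sh.

```python
def parse_init_points(init_point, num_strokes):
    """Split "x1,y1,...,xN,yN" into a list of (x, y) starting points.

    Hypothetical helper for illustration only. It assumes one starting
    point per stroke, as in the four-corner example above.
    """
    values = [float(v) for v in init_point.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("coordinates must come in x,y pairs")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("all coordinates must lie in the range 0 to 1")
    # Pair up even-indexed x values with odd-indexed y values.
    points = list(zip(values[0::2], values[1::2]))
    if len(points) != num_strokes:
        raise ValueError("expected one starting point per stroke")
    return points


points = parse_init_points("0,0,0,1,1,0,1,1", num_strokes=4)
print(points)
```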

When num_aug_clip is set to 1, data augmentation is disabled. Data augmentation makes each iteration significantly slower but produces better results.

sketchdreamer's Issues

Code release for robotic demonstration

Hello,

Thanks for the great work! It happens to be exactly what I am looking for.
I would like to create an installation where a visitor gives a text prompt and draws something on a board, and then a robotic arm completes the drawing.
Do you plan to release the code? I would be very happy if you did (even an undocumented, uncleaned version)!
Of course I would credit you and the other authors.

Cheers

Cannot reproduce the results in the paper

Hello, I am trying to run your code on the QuickDraw dataset, but the results look a little odd. For example, with "a_simple_drawing_of_an_apple" there are some extra lines that don't match the apple. Should I reduce the number of strokes?
