marco2929 / stabledgroundingsam

Synthetic dataset generation with Stable Diffusion, with segmentation masks generated using Grounding DINO and the Segment Anything Model

License: Apache License 2.0

Python 100.00%
auto-labeling computer-vision deep-learning grounding-dino image-annotation image-classification instance-segmentation labeling-tool machine-learning multimodal object-detection pytorch segment-anything stable-diffusion supervision synthetic-dataset-generation yolov5 yolov8


StabledGroundingSAM

This package takes an image and a text file containing the names of the objects of interest and generates new images using Stable Diffusion. Next, Grounding DINO detects the objects in the generated images and draws a bounding box around each one. These boxes are then passed to Segment Anything, which generates a segmentation mask for each object. The generated dataset is saved in the YOLO format.
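To make the final step concrete, here is a minimal sketch of what a YOLO label line looks like, assuming the common YOLO text-label convention (class index followed by coordinates normalized to the image size). The function names `yolo_bbox_line` and `yolo_polygon_line` are hypothetical and are not taken from this repository, which may write its labels differently.

```python
from typing import Sequence, Tuple


def yolo_bbox_line(class_id: int,
                   box_xyxy: Tuple[float, float, float, float],
                   img_w: int, img_h: int) -> str:
    """Format a pixel-space box (x_min, y_min, x_max, y_max) as a YOLO detection label."""
    x_min, y_min, x_max, y_max = box_xyxy
    # YOLO detection labels store the box centre and size, normalized to [0, 1].
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def yolo_polygon_line(class_id: int,
                      polygon: Sequence[Tuple[float, float]],
                      img_w: int, img_h: int) -> str:
    """Format a pixel-space polygon as a YOLO segmentation label."""
    coords = []
    for x, y in polygon:
        # Segmentation labels list the normalized x, y of every polygon vertex.
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return " ".join([str(class_id), *coords])


if __name__ == "__main__":
    print(yolo_bbox_line(0, (100, 50, 300, 250), 400, 500))
    print(yolo_polygon_line(1, [(0, 0), (200, 0), (200, 250)], 400, 500))
```

Each image in a YOLO dataset typically gets one text file with one such line per object.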

💻 Install

Install the requirements with pip in a Python environment between versions 3.7 and 3.11 (inclusive).

pip install -r requirements.txt
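If you are unsure whether your interpreter falls in the supported range, a quick check along these lines can help. This is only a sketch: the helper name `python_supported` is hypothetical and the repository itself may not perform such a check.

```python
import sys

# Supported range as stated in the install instructions (inclusive).
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 11)


def python_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version lies within the supported range."""
    return MIN_PYTHON <= tuple(version_info[:2]) <= MAX_PYTHON


if __name__ == "__main__":
    status = "supported" if python_supported() else "outside the supported range"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```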

🌀 Results

Input prompt: apple

Input image: (image not shown)

Stable Diffusion image and segmented image: (images not shown)

🔥 Quickstart

Clone the StabledGroundingSAM repository from GitHub.

git clone https://github.com/Marco2929/StabledGroundingSAM.git

Change the current directory to the StabledGroundingSAM folder.

cd StabledGroundingSAM/

Clone the GroundingDINO repository from GitHub and follow the instructions there.

git clone https://github.com/IDEA-Research/GroundingDINO.git

Download the weights for Segment Anything and Grounding DINO (if not already done) and place them in the weights folder.

mkdir weights
cd weights
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget -q https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
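Because `wget -q` runs silently, it can be worth confirming that both files actually arrived. The sketch below assumes the two file names from the download URLs above and a folder called `weights`; the helper `missing_weights` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from the download URLs above.
EXPECTED_WEIGHTS = ("groundingdino_swint_ogc.pth", "sam_vit_h_4b8939.pth")


def missing_weights(weights_dir="weights",
                    names: Iterable[str] = EXPECTED_WEIGHTS) -> List[str]:
    """Return the expected weight files that are not present in weights_dir."""
    folder = Path(weights_dir)
    return [name for name in names if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All weight files found.")
```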

Add a classes.txt file listing the objects you want to label. Run the program, providing the location of the classes.txt file, the initial image, and the number of pictures the model should generate.

python -m main <classes.txt location> <initial image location> <number of pictures>
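The README does not specify the layout of classes.txt. Assuming one class name per line, a file like this could be read as follows; `read_classes` is a hypothetical helper, not the repository's own loader.

```python
from pathlib import Path
from typing import List


def read_classes(path) -> List[str]:
    """Read class names from a text file, assuming one name per line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Ignore blank lines and surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]
```

With a file containing `apple` and `banana` on separate lines, this returns the two names in file order.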

🔧 Adjustables

Many optional arguments are available to adjust the output of the models.
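As a rough picture of the command-line surface, the three positional arguments from the Quickstart and the optional flags described in this section could be declared with argparse as sketched here. The positional argument names, the types, and the `None` defaults are assumptions; the repository's real parser and default values may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser for the arguments named in this README."""
    parser = argparse.ArgumentParser(prog="main")
    # Positional arguments from the Quickstart command (names are assumed).
    parser.add_argument("classes_file", help="location of classes.txt")
    parser.add_argument("initial_image", help="location of the initial image")
    parser.add_argument("num_pictures", type=int, help="number of pictures to generate")
    # Stable Diffusion options.
    parser.add_argument("--diffusion_prompt", default=None)
    parser.add_argument("--guidance_scale", type=float, default=None)
    parser.add_argument("--strength", type=float, default=None)
    # Grounding DINO options.
    parser.add_argument("--box_threshold", type=float, default=None)
    parser.add_argument("--text_threshold", type=float, default=None)
    # Segment Anything / YOLO export options.
    parser.add_argument("--min_image_area_percentage", type=float, default=None)
    parser.add_argument("--max_image_area_percentage", type=float, default=None)
    parser.add_argument("--approximation_percentage", type=float, default=None)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```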

Stable Diffusion:

--diffusion_prompt: Provide a different prompt to Stable Diffusion (by default, the class names from classes.txt)

See the Stable Diffusion documentation for these options:

--guidance_scale

--strength
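For intuition about these two options, here is a small sketch of what they conventionally mean, assuming an image-to-image Stable Diffusion pipeline in the style of Hugging Face diffusers (the README does not say which pipeline is used). `strength` scales how many denoising steps are actually run on the initial image, and `guidance_scale` weights the text-conditioned noise prediction against the unconditional one (classifier-free guidance). The function names are hypothetical.

```python
from typing import List, Sequence


def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps run in image-to-image mode for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    # Lower strength keeps more of the initial image by skipping early steps.
    return min(int(num_inference_steps * strength), num_inference_steps)


def apply_guidance(noise_uncond: Sequence[float],
                   noise_text: Sequence[float],
                   guidance_scale: float) -> List[float]:
    """Classifier-free guidance: push the prediction toward the text-conditioned one."""
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


if __name__ == "__main__":
    print(effective_denoising_steps(50, 0.8))
    print(apply_guidance([0.0, 1.0], [1.0, 3.0], 7.5))
```

In practice, a higher strength lets the generated image drift further from the initial image, and a higher guidance scale makes it follow the prompt more closely.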

Grounding DINO:

See the Grounding DINO documentation for these options:

--box_threshold

--text_threshold
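The sketch below is a simplified, pure-Python illustration of how the two thresholds are commonly understood to interact: a box is kept when its best text-token score exceeds `box_threshold`, and the label for a kept box is built from the tokens whose score exceeds `text_threshold`. The function name and data layout are hypothetical, and this is not the Grounding DINO API.

```python
from typing import List, Sequence, Tuple


def filter_detections(token_scores: Sequence[Sequence[float]],
                      tokens: Sequence[str],
                      box_threshold: float,
                      text_threshold: float) -> List[Tuple[int, str]]:
    """Return (box index, phrase) for every box that passes box_threshold.

    token_scores[i][j] is the similarity of box i to prompt token j.
    """
    kept = []
    for index, scores in enumerate(token_scores):
        # Drop boxes whose best token score is not above the box threshold.
        if not scores or max(scores) <= box_threshold:
            continue
        # The label is made of the tokens that clear the text threshold.
        phrase = " ".join(tok for tok, s in zip(tokens, scores) if s > text_threshold)
        kept.append((index, phrase))
    return kept


if __name__ == "__main__":
    tokens = ["red", "apple", "banana"]
    scores = [[0.30, 0.60, 0.05], [0.10, 0.20, 0.15], [0.02, 0.03, 0.50]]
    print(filter_detections(scores, tokens, box_threshold=0.35, text_threshold=0.25))
```

Raising `box_threshold` yields fewer but more confident boxes, while raising `text_threshold` makes the assigned labels stricter.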

Segment Anything (only available for the YOLO dataset):

A mask is discarded if its area, relative to the image area, is below the minimum or above the maximum percentage. This is useful for fine-tuning the segmentation.

--min_image_area_percentage

--max_image_area_percentage

--approximation_percentage: Changes the sharpness of the mask.
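The area filter can be pictured with a short sketch. It assumes that masks are 2D grids of 0/1 values and that the two limits are given as fractions of the image area between 0.0 and 1.0, which the README does not state. `mask_area_fraction` and `keep_mask` are hypothetical helpers rather than the repository's own code.

```python
from typing import Sequence


def mask_area_fraction(mask: Sequence[Sequence[int]]) -> float:
    """Fraction of image pixels covered by a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    covered = sum(1 for row in mask for value in row if value)
    return covered / total


def keep_mask(mask: Sequence[Sequence[int]],
              min_image_area_percentage: float = 0.0,
              max_image_area_percentage: float = 1.0) -> bool:
    """Keep the mask only if its area fraction lies within the given limits."""
    fraction = mask_area_fraction(mask)
    return min_image_area_percentage <= fraction <= max_image_area_percentage


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mask_area_fraction(mask), keep_mask(mask, 0.1, 0.5))
```

A lower limit removes tiny spurious masks, and an upper limit removes masks that swallow most of the image.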
