Synthetic Dataset Generation

Create copy/paste synthetic images for object detection and instance segmentation.

This repository is a modified and extended version of debidatta/syndata-generation and a-nau/synthetic-dataset-generation. The augmentation code is unchanged, but the code was made more modular. All credit goes to the original authors (see also Citation).

Set-up

Clone the repository

git clone https://github.com/JureHudoklin/CopyPaste_DatasetGenerator.git

Create a virtual environment (optional) and activate it

virtualenv -p python3.8 copy_paste_env
source copy_paste_env/bin/activate

Install python dependencies

pip install -r requirements.txt

Configuration

Image Annotation Files

To generate a dataset, two JSON annotation files are required:

  1. A JSON file containing the annotations of the objects
  2. A JSON file containing the annotations of the background images
  • The objects JSON file should contain a list of objects, each with the following fields (the # comments are explanatory only and are not valid JSON):

    [
      {
        "name": "example_cat", # name of the category
        "supercategory": "example_super_cat", # name of the supercategory
        "file_name": "my_img_0.jpg", # name of the image file
        "img_path": "/home/my_home/images/my_img_0.jpg" # absolute path to the image file
      },
      ...
    ]
  • The background JSON file should contain a list of background images, each with the following fields:

    [
      {
        "file_name": "my_img_0.jpg", # name of the image file
        "img_path": "/home/my_home/images/my_img_0.jpg" # absolute path to the image file
      },
      ...
    ]
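The two annotation files described above can be produced with a short script. The following is a minimal sketch, not part of the repository: the function names (`build_object_annotations`, `build_background_annotations`, `write_annotations`) and the directory layout (one sub-folder per category, with the folder name used as the category name) are assumptions made for illustration. Only the JSON keys are taken from the format described above.

```python
import json
from pathlib import Path

# Image extensions to pick up; adjust to your data (assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def _image_files(root):
    """Return all image files below root, sorted for reproducible output."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_object_annotations(objects_root, supercategory="object"):
    """Build the objects list; the parent folder name is used as the category
    (hypothetical layout: objects_root/<category>/<image>)."""
    return [
        {
            "name": img.parent.name,
            "supercategory": supercategory,
            "file_name": img.name,
            "img_path": str(img.resolve()),  # absolute path
        }
        for img in _image_files(objects_root)
    ]


def build_background_annotations(backgrounds_root):
    """Build the background list with file name and absolute path only."""
    return [
        {"file_name": img.name, "img_path": str(img.resolve())}
        for img in _image_files(backgrounds_root)
    ]


def write_annotations(entries, out_path):
    """Write a list of annotation entries as a JSON file."""
    with open(out_path, "w") as fh:
        json.dump(entries, fh, indent=2)
```

For example, with a layout such as objects/example_cat/my_img_0.jpg, calling `write_annotations(build_object_annotations("objects"), "objects.json")` would write an objects file in the format shown above. The resulting paths still have to be entered in the config file.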

Configuration File

You can configure the dataset generation by editing the config.py file inside the configs folder. All options are explained in that file. Make sure to set the paths to the annotation files and the output folder inside the config file.

If you provide a load_path when creating a new dataset, the config file will be loaded from the provided path.

from configs.config import Config

cfg = Config(load_path="path/to/config/file/filename.json")

You can also save a config with the following code:

from configs.config import Config

cfg = Config()
cfg.save(path)  # path: where to write the config file
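To illustrate the save and load round trip described above without depending on the repository, here is a minimal stand-in sketch. `SketchConfig` and its fields (`objects_json`, `backgrounds_json`, `output_dir`) are hypothetical; the real option names are defined in configs/config.py, and the real `Config` class may be implemented differently. Only the `load_path` argument and the `save(path)` method mirror the usage shown above.

```python
import json
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class SketchConfig:
    # Hypothetical options; the real ones live in configs/config.py.
    objects_json: str = ""
    backgrounds_json: str = ""
    output_dir: str = ""
    # If given, stored values are read from this JSON file on construction.
    load_path: Optional[str] = None

    def __post_init__(self):
        if self.load_path is None:
            return
        with open(self.load_path) as fh:
            stored = json.load(fh)
        known = {fld.name for fld in fields(self)} - {"load_path"}
        for key, value in stored.items():
            if key in known:  # ignore keys this sketch does not know
                setattr(self, key, value)

    def save(self, path):
        """Write all options except load_path to a JSON file."""
        data = asdict(self)
        data.pop("load_path")
        with open(path, "w") as fh:
            json.dump(data, fh, indent=2)
```

With this stand-in, `SketchConfig(output_dir="out").save("cfg.json")` followed by `SketchConfig(load_path="cfg.json")` would restore the same `output_dir`, which is the behavior the two snippets above describe for `Config`.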

Run

To generate a dataset you can run generate_dataset.py:

python3 generate_dataset.py

Citation

If you use this code for scientific research, please consider citing the following two works, on which this repository is based.

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

The original work, including the code on which this repository is built. Thanks a lot to the authors for providing their code!

@InProceedings{Dwibedi_2017_ICCV,
author = {Dwibedi, Debidatta and Misra, Ishan and Hebert, Martial},
title = {Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}

Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

Our work for which this repository was developed.

@inproceedings{naumannScrapeCutPasteLearn2022,
  title = {Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics},
  booktitle = {{{IEEE Conference}} on {{Machine Learning}} and {{Applications}} ({{ICMLA}})},
  author = {Naumann, Alexander and Hertlein, Felix and Zhou, Benchun and Dörr, Laura and Furmans, Kai},
  date = {2022},
}
