
doda's Introduction

DODA

Official implementation of Diffusion for Object-detection Domain Adaptation in Agriculture

DODA is a data synthesizer that can generate high-quality object detection data for new domains in agriculture, and help the detectors adapt to the new domains.

overview of DODA

Pretrained Models

Model Dataset Resolution Training Iters Download Link
DODA-L2I COCO 512x512 30K Google drive
DODA-L2I COCO 256x256 100K Google drive
VAE GWHD2021 256x256 170K Google drive
DODA GWHD2021 256x256 80K Google drive
DODA-ldm GWHD2021 256x256 315K Google drive

Evaluation

Setup Environment

conda create -y -n DODA python=3.8.5
conda activate DODA
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
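
After installation, you can optionally run a quick sanity check to confirm that the pinned PyTorch build sees your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"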

Download Datasets

bash Download_dataset.sh

Prepare Datasets

python prepare_coco.py
python prepare_wheat_trainset.py   # If you only want to test the model's performance on GWHD, there is no need to run this line
python prepare_Terraref_testset.py

Generate Images for Evaluation

Generate images according to the bounding boxes of the COCO 2017 validation set: first download the pretrained DODA-L2I weights to the /models folder, then run:

python generate_coco_testimg.py

Generate images according to the bounding boxes and reference images of the Terraref domain: first download the pretrained DODA weights to the /models folder, then run:

python prepare_Terraref_testset.py

If you want to generate data to train the detector, first generate layout images using random_generate_layout_images.py, then use generate_data_for_target_domain.py to generate the data (a minimal example of the two-step flow is shown below). If you want to generate data for your own domain, please refer to generate_data_for_target_domain.py.
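
Assuming the two scripts run with their default settings (check each script for configurable paths and options), the two-step flow looks like:

python random_generate_layout_images.py
python generate_data_for_target_domain.py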

Generate images in GUI

You can try our method to generate images for wheat through the GUI:

python wheat_gradio_box2image.py

Please upload BOTH the reference image and the layout image, as shown:

web_example

PS: The demo reference image and layout image can be found in the /figures folder. More images can be found in the /dataset folder after running prepare_wheat_trainset.py.

Or you can simply draw a layout image yourself with drawing software. Each item should have a distinguishable color with saturated R, G, B channels (each channel set to 0 or 255), for example (0, 0, 255), (255, 0, 255), etc. Below are some examples of possible layout images, followed by a small scripted sketch:

layout_example
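
If you prefer to script the layout instead of drawing it, here is a minimal sketch (assuming the Pillow library, a 256x256 canvas, filled boxes, and made-up box coordinates; adapt it to the conventions of the layout images in /dataset):

from PIL import Image, ImageDraw

# Black canvas at the working resolution used above (256x256).
layout = Image.new("RGB", (256, 256), (0, 0, 0))
draw = ImageDraw.Draw(layout)

# Hypothetical boxes (x0, y0, x1, y1); each box gets a distinguishable
# saturated color, i.e. every R/G/B channel is either 0 or 255.
boxes = [(20, 30, 80, 90), (120, 40, 200, 110), (60, 150, 140, 230)]
colors = [(0, 0, 255), (255, 0, 255), (0, 255, 255)]

for box, color in zip(boxes, colors):
    draw.rectangle(box, fill=color)

layout.save("my_layout.png")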

Train your own DODA

DODA training is divided into three stages, in order: the VAE, the LDM, and the ControlNet. This repository reads the dataset through a txt file, so first write the filenames of all the images in your own dataset into a txt file (see the sketch below).
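
A minimal sketch for building that txt file, assuming your images live in a hypothetical folder my_dataset/ and the expected entries are bare filenames (check the dataset txt files prepared by the scripts above for the exact format):

import os

image_dir = "my_dataset"          # hypothetical path to your images
exts = (".png", ".jpg", ".jpeg")  # common image extensions

# One filename per line, sorted for reproducibility.
names = sorted(f for f in os.listdir(image_dir) if f.lower().endswith(exts))
with open("my_dataset_filenames.txt", "w") as fh:
    fh.write("\n".join(names))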

Training of VAE

Modify the config in train_wheat.py:

config = 'configs/autoencoder/DODA_wheat_autoencoder_kl_64x64x3.yaml'

Modify txt_file and data_root in the config file to point to your filenames txt file and your dataset folder, respectively.

Then train the VAE by running:

python train_wheat.py

The VAE is very robust, so we recommend skipping VAE training and using the pre-trained weights kl-f4-wheat.ckpt we provide.

Training of ldm

Modify the config in train_wheat.py:

config = 'configs/latent-diffusion/DODA_wheat_ldm_kl_4.yaml'

Modify the ckpt_path in the config file DODA_wheat_ldm_kl_4.yaml to the path of your VAE weights (or the VAE weights we provide).

Modify txt_file and data_root in the config file to point to your filenames txt file and your dataset folder, respectively.

Then train the ldm by running:

python train_wheat.py

Training of cldm

Modify the input_path in tool_add_control.py to the path of your ldm weights (or the ldm weights we provide), and modify output_path to specify the name of the output weights. Then run the script to add the ControlNet to the ldm:

python tool_add_wheat_control.py

Modify the resume_path in train_wheat.py to the path of the output weight.

Modify the config in train_wheat.py:

config = 'configs/controlnet/DODA_wheat_cldm_kl_4.yaml'

Modify txt_file and data_root in the config file to point to your filenames txt file and your dataset folder, respectively.

Then train the cldm by running:

python train_wheat.py

Hyperparameters for training

Hyperparameters

Training tips

Diffusion models are data-hungry, and using more data always gives better results, so we strongly recommend mixing your data with GWHD for training. To mix the data, put all the images from your own dataset and from GWHD into one folder and write the filenames of all the images into one txt file (a minimal sketch follows).
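
A minimal sketch of that merge, assuming hypothetical folder names my_dataset/ and gwhd2021/ and a combined folder mixed_dataset/ (then point txt_file and data_root in the config at the resulting txt file and folder):

import os
import shutil

src_dirs = ["my_dataset", "gwhd2021"]   # hypothetical source folders
dst_dir = "mixed_dataset"               # combined folder used for training
os.makedirs(dst_dir, exist_ok=True)

names = []
for src in src_dirs:
    for fname in sorted(os.listdir(src)):
        if fname.lower().endswith((".png", ".jpg", ".jpeg")):
            shutil.copy(os.path.join(src, fname), os.path.join(dst_dir, fname))
            names.append(fname)

with open("mixed_filenames.txt", "w") as fh:
    fh.write("\n".join(names))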


doda's Issues

train

Why does this error occur when using the ldm-trained model as the pre-trained weights for cldm training? What should I do?

ldm model question

Hello author, I am amazed by your work. When training the generative model, I used the LDM method on 12,000 416×416 images without pre-trained weights. When generating images, why are all of them noisy?

generate images

a_a_120
When I use the trained cldm to generate images, the result is as shown in the figure above, which is not the effect I want. I expected something like the image below, but that image was generated by the ldm model, which means the results got worse after adding cldm.

a_a_143

VAE

Hello, I am amazed by your work. I now want to train my own VAE model, but I don't have the training code. Can you provide the code for training the VAE?

diffusion model training (stage 1) with target features

Hi,

thanks for the paper and code. I have a few questions.
Do you have the code to train the diffusion model with MAE features?
How long did it take to train this part, and on what hardware?

Concerning the paper, you explain that conditioning on the layout works better with ControlNet than with cross-attention, but in Figure 2 (stage 2) you show an input to cross-attention. What does that mean?

Thanks for your help
