
editanything's Introduction

Edit Anything by Segment-Anything

HuggingFace space

This is an ongoing project that aims to edit and generate anything in an image, powered by Segment Anything, ControlNet, BLIP2, Stable Diffusion, etc.

Any form of contribution or suggestion is very welcome!

News🔥

2023/08/09 - Revised the UI and code and fixed multiple known issues.

2023/07/25 - EditAnything is accepted by the ACM MM demo track.

2023/06/09 - Support cross-image region drag and merge, unleash creative fusion!

2023/05/24 - Support multiple high-quality character edits: clothes, haircuts, colored contact lenses.

2023/05/22 - Support sketch-to-image by adjusting the mask align strength in sketch2image.py!

2023/05/13 - Support interactive segmentation with click operation!

2023/05/11 - Support tile model for detail refinement!

2023/05/04 - New demos of beauty/handsome edit and generation are released!

2023/05/04 - ControlNet-based inpainting on any LoRA model is now supported. EditAnything can operate on any base/LoRA model without requiring an inpainting model.

More update logs.

2023/05/01 - Models V0.4 based on Stable Diffusion 1.5/2.1 are released. The new models are trained with more data and iterations. See the Model Zoo.

2023/04/20 - Support customized editing with DreamBooth.

2023/04/17 - Support converting SAM masks to semantic segmentation masks.

2023/04/17 - Support different alignment degrees between edited parts and the SAM mask; check it out on the DEMO!

2023/04/15 - Gradio demo on Huggingface is released!

2023/04/14 - A new model trained on the LAION dataset is released.

2023/04/13 - Support pretrained model auto-downloading and Gradio in sam2image.py.

2023/04/12 - An initial version of text-guided edit-anything is in sam2groundingdino_edit.py (object-level) and sam2vlpart_edit.py (part-level).

2023/04/10 - An initial version of edit-anything is in sam2edit.py.

2023/04/10 - We transferred the pretrained model into diffusers style; it is auto-loaded when using sam2image_diffuser.py. Now you can easily combine our pretrained model with different base models!

2023/04/09 - We released a pretrained Stable Diffusion-based ControlNet model that generates images conditioned on SAM segmentation.

Features

Try our HuggingFace DEMO🔥🔥🔥

Unleash creative fusion: Cross-image region drag and merge!🔥

image image

Clothes editing!🔥

image

Haircut editing!🔥

image

Colored contact lenses!🔥

image

Human replacement with tile refinement!🔥

image

Draw your Sketch and Generate your Image!🔥

prompt: "a paint of a tree in the ground with a river."

image image image
More demos.

prompt: "a paint, river, mountain, sun, cloud, beautiful field."

image image image

prompt: "a man, midsplit center parting hair, HD."

image image image

prompt: "a woman, long hair, detailed facial details, photorealistic, HD, beautiful face, solo, candle, brown hair, blue eye."

image image image

You can also use the generated image and the SAM model to further refine your sketch!

Generate/Edit your beauty!!!🔥🔥🔥

Edit Your beauty and Generate Your beauty

image image

Customized editing with layout alignment control.

image

EditAnything+DreamBooth: Train a customized DreamBooth Model with `tools/train_dreambooth_inpaint.py` and replace the base model in `sam2edit.py` with the trained model.
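
A minimal sketch of this swap, assuming the `EditAnythingLoraModel` class from `sam2edit_lora.py` with the constructor arguments quoted in the issues section below; the DreamBooth model path is a placeholder for your own trained checkpoint:

    # Hypothetical sketch: load EditAnything with a custom DreamBooth base model.
    from sam2edit_lora import EditAnythingLoraModel

    model = EditAnythingLoraModel(
        base_model_path="path/to/my_dreambooth_model",    # placeholder: your DreamBooth output dir
        controlmodel_name="LAION Pretrained(v0-4)-SD15",  # SAM-conditioned ControlNet from the Model Zoo
        extra_inpaint=True,
        lora_model_path=None,
        use_blip=True,
    )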

Image Editing with layout alignment control.

image

Keep the layout and Generate your season!

original paint SAM

Human Prompt: "A paint of spring/summer/autumn/winter field."

spring summer autumn winter

Edit Specific Thing by Text-Grounding and Segment-Anything

Editing by Text-guided Part Mask

Text Grounding: "dog head"

Human Prompt: "cute dog" p

More demos.

Text Grounding: "cat eye"

Human Prompt: "A cute small humanoid cat" p

Editing by Text-guided Object Mask

Text Grounding: "bench"

Human Prompt: "bench" p

Edit Anything by Segment-Anything

Human Prompt: "esplendent sunset sky, red brick wall" p

More demos.

Human Prompt: "chairs by the lake, sunny day, spring" p

Generate Anything by Segment-Anything

BLIP2 Prompt: "a large white and red ferry" p (1:input image; 2: segmentation mask; 3-8: generated images.)

More demos.

BLIP2 Prompt: "a cloudy sky" p

BLIP2 Prompt: "a black drone flying in the blue sky" p

  1. The human prompt and the BLIP2-generated prompt build the text instruction.
  2. The SAM model segments the input image into category-free segmentation masks.
  3. The segmentation mask and the text instruction guide the image generation (a minimal sketch of these steps follows below).
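
Below is a minimal, illustrative sketch of these three steps using the public transformers, segment-anything, and diffusers APIs. The checkpoint and file names (Salesforce/blip2-opt-2.7b, models/sam_vit_h_4b8939.pth, shgao/edit-anything-v0-4-sd15, runwayml/stable-diffusion-v1-5, input.png, generated.png) are assumptions; the repo's own scripts (e.g. sam2image_diffuser.py) handle mask colorization, resolution, and sampling details differently.

    # Illustrative pipeline sketch (assumptions noted above); not the repo's exact implementation.
    import numpy as np
    import torch
    from PIL import Image
    from transformers import Blip2Processor, Blip2ForConditionalGeneration
    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
    from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

    image = Image.open("input.png").convert("RGB")  # placeholder input image

    # 1. Build the text instruction: a BLIP2 caption, optionally extended by a human prompt.
    processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
    blip2 = Blip2ForConditionalGeneration.from_pretrained(
        "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
    ).to("cuda")
    inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
    caption = processor.decode(blip2.generate(**inputs)[0], skip_special_tokens=True)
    prompt = caption + ", best quality"  # append any human prompt here

    # 2. Segment the image with SAM (category-free masks) and paint each mask a random color.
    sam = sam_model_registry["vit_h"](checkpoint="models/sam_vit_h_4b8939.pth").to("cuda")
    masks = SamAutomaticMaskGenerator(sam).generate(np.array(image))
    control = np.zeros((image.height, image.width, 3), dtype=np.uint8)
    for m in masks:
        control[m["segmentation"]] = np.random.randint(0, 255, 3)

    # 3. Generate images conditioned on the colored SAM mask and the text instruction.
    controlnet = ControlNetModel.from_pretrained(
        "shgao/edit-anything-v0-4-sd15", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")
    result = pipe(prompt, image=Image.fromarray(control), num_inference_steps=30).images[0]
    result.save("generated.png")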

Generate semantic labels for each SAM mask.

p

python sam2semantic.py

Highlight features:

  • A pretrained ControlNet with the SAM mask as condition enables image generation with fine-grained control.
  • Category-unrelated SAM masks enable more forms of editing and generation.
  • BLIP2 text generation enables control without manually written prompts.

Setup

Create an environment

    conda env create -f environment.yaml
    conda activate control

Install BLIP2 and SAM

Put these models in the models folder.

# BLIP2 and SAM will be auto-installed by running app.py
pip install git+https://github.com/huggingface/transformers.git

pip install git+https://github.com/facebookresearch/segment-anything.git

# For text-guided editing
pip install git+https://github.com/openai/CLIP.git

pip install git+https://github.com/facebookresearch/detectron2.git

pip install git+https://github.com/IDEA-Research/GroundingDINO.git

Download pretrained model

# Segment-anything ViT-H SAM model will be auto downloaded. 

# BLIP2 model will be auto downloaded.

# Part Grounding Swin-Base Model.
wget https://github.com/Cheems-Seminar/segment-anything-and-name-it/releases/download/v1.0/swinbase_part_0a0000.pth

# Grounding DINO Model.
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth

# Get pretrained model from huggingface. 
# No need to download this! But please install safetensors for reading the ckpt.

Run Demo

python app.py
# or
python editany.py
# or
python sam2image.py
# or
python sam2vlpart_edit.py
# or
python sam2groundingdino_edit.py

Model Zoo

Model | Features | Download Path
SAM Pretrained (v0-1) | Good nature sense | shgao/edit-anything-v0-1-1
LAION Pretrained (v0-3) | Good face | shgao/edit-anything-v0-3
LAION Pretrained (v0-4) | Supports Stable Diffusion 1.5/2.1; more training data and iterations; good face | shgao/edit-anything-v0-4-sd15, shgao/edit-anything-v0-4-sd21

Training

  1. Generate the training dataset with dataset_build.py.
  2. Transfer the Stable Diffusion model with tool_add_control_sd21.py.
  3. Train the model with sam_train_sd21.py.

Acknowledgement

@InProceedings{gao2023editanything,
  author = {Gao, Shanghua and Lin, Zhijie and Xie, Xingyu and Zhou, Pan and Cheng, Ming-Ming and Yan, Shuicheng},
  title = {EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation},
  booktitle = {Proceedings of the 31st ACM International Conference on Multimedia, Demo track},
  year = {2023},
}

This project is based on:

Segment Anything, ControlNet, BLIP2, MDT, Stable Diffusion, Large-scale Unsupervised Semantic Segmentation, Grounded Segment Anything: From Objects to Parts, Grounded-Segment-Anything

Thanks for these amazing projects!

editanything's People

Contributors

fingerrec, gasvn, ikuinen, panzhous, skycol, xingyuxie


editanything's Issues

Recreating Haircut Editing

The Haircut editing effect in the example is amazing.
I tried to reproduce the effect, but the results were not satisfactory: I could only change the color, but couldn't make long hair short or short hair long. Could you please provide guidance on how to reproduce the haircut editing effect from the example? I would appreciate it if you could provide the prompt and other parameters.
Thank you!

TypeError: forward() got an unexpected keyword argument 'guess_mode'

While running editany.py and selecting points to segment my desired area, I encountered an error immediately after clicking the 'Run' button. The error message reads as follows: "TypeError: forward() got an unexpected keyword argument 'guess_mode'."
Could you please help me address this issue?

High memory usage after processing multiple images

I deployed this repository on my dev server and started app.py, but I noticed that the process occupies a lot of system RAM (~80 GB) after processing 7 or 8 images, and the RAM is apparently not freed after each image. Thanks for your help.

Reference mode for the cross-image drag and merge functionality

Thanks for the excellent work.

It looks like you use the reference mode recently released by ControlNet to achieve the cross-image drag and merge feature. Can you point to any resource/material on how this reference mode works in ControlNet? Also, can you point to the code in your repo that does this functionality?

Thanks for your time and help.

Gradio demo has an error

I ran "sam2edit.py". When I upload an image to run, an such error occured.

File "/project/EditAnything/sam2edit_lora.py", line 615, in process
x_samples_tile = self.tile_pipe(
File "/project/EditAnything/utils/stable_diffusion_controlnet_inpaint.py", line 1571, in call
down_block_res_samples, mid_block_res_sample = self.controlnet(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/controlnet.py", line 526, in forward
sample, res_samples = downsample_block(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 867, in forward
hidden_states = attn(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
hidden_states = block(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/attention.py", line 331, in forward
attn_output = self.attn2(
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 267, in forward
return self.processor(
File "/root/miniconda3/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 689, in call
key = attn.to_k(encoder_hidden_states)
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x1024 and 768x320)

Upload to replicate

Hey!

Lovely concept and results.

It would be a great idea if you uploaded it to replicate.com so it is possible to do HTTP API inference from my custom UI!
What do you think?

Keep it up guys.
Moe

Dependencies in `requirements.txt` have module conflicts.

Background

Dependencies in requirements.txt have module conflicts.

Description

There are multiple dependencies mentioned in the requirements.txt file (the -> denotes indirect dependencies):

opencv-contrib-python
basicsr->opencv-python
albumentations->opencv-python-headless
invisible-watermark->opencv-python
qudida->opencv-python-headless

The official spec mentioned that the opencv-python package is for the desktop environment, while opencv-python-headless is for the server environment. The documentation also states that these packages cannot be installed simultaneously (the exact wording is: “There are four different packages (see options 1, 2, 3, and 4 below) and you should SELECT ONLY ONE OF THEM.”). This is because they both use the same module name cv2.

During the installation process using pip, the package installed later will override the cv2 module from the previously installed package (specifically, the modules within the cv2 folders that exist in both packages). Furthermore, the dependency graph even includes different versions of these two packages. It is certain that the common files with the same path in these two packages contain different contents. Therefore, there may be functional implications when using them. However, without analyzing the specific code and function call hierarchy of this project, it can be stated that issues related to overwriting and module conflicts do exist.
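
For instance, a small illustrative check (not specific to this repo) that lists which OpenCV distributions ended up installed side by side in the same environment:

    # List installed OpenCV distributions; more than one entry means the shared
    # cv2 module has been overwritten by whichever package was installed last.
    from importlib.metadata import distributions

    opencv = sorted(
        {(d.metadata["Name"], d.version) for d in distributions()
         if (d.metadata["Name"] or "").lower().startswith("opencv")}
    )
    for name, version in opencv:
        print(name, version)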

Steps to Reproduce

pip install -r requirements.txt

Desired Change

Indeed, it is not an ideal behavior for modules to be overwritten, even if they are not actively used or if the overwritten module is the one being called. It introduces uncertainty and can cause issues in the long run, especially if there are changes or updates to the overwritten modules in future development. It is generally recommended to avoid such conflicts and ensure that only the necessary and compatible dependencies are declared in the requirements to maintain a stable and predictable environment for the project.

We believe that although this project can only modify direct dependencies and indirect dependencies are a black box, it is possible to add additional explanations rather than directly declaring both conflicting packages in the requirements.txt file. Or maybe you can check the dependencies and remove the redundant dependencies from the requirements.txt.

Adding extra explanations or documentation about the potential conflicts and the need to choose only one of the conflicting packages can help developers understand the issue and make informed decisions. Including a clear instruction or warning in the project’s documentation can guide users to choose the appropriate package based on their specific requirements.

How to use drag and merge

Hi, could you please provide more details on how to use the drag and merge function, and maybe more code details? Thanks!

Browser UI

Hi,

Before I install it, I need to know whether this has a browser UI or is purely command line.

Thanks.

AttributeError: module 'keras.backend' has no attribute 'is_tensor'

Generating SAM seg:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1431, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1109, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 706, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/content/drive/MyDrive/Colab Notebooks/EditAnything-main/EditAnything-main/annotator/util.py", line 86, in wrapper
result = func(self, *args, **kwargs)
File "/content/drive/MyDrive/Colab Notebooks/EditAnything-main/EditAnything-main/editany_lora.py", line 743, in process
control = einops.rearrange(control, "b h w c -> b c h w").clone()
File "/usr/local/lib/python3.10/dist-packages/einops/einops.py", line 424, in rearrange
return reduce(tensor, pattern, reduction='rearrange', **axes_lengths)
File "/usr/local/lib/python3.10/dist-packages/einops/einops.py", line 368, in reduce
return recipe.apply(tensor)
File "/usr/local/lib/python3.10/dist-packages/einops/einops.py", line 203, in apply
backend = get_backend(tensor)
File "/usr/local/lib/python3.10/dist-packages/einops/_backends.py", line 49, in get_backend
if backend.is_appropriate_type(tensor):
File "/usr/local/lib/python3.10/dist-packages/einops/_backends.py", line 513, in is_appropriate_type
return self.K.is_tensor(tensor) and self.K.is_keras_tensor(tensor)
AttributeError: module 'keras.backend' has no attribute 'is_tensor'

Details about training part

Hi, thanks for this great project. I have some questions about the training part.

  1. From data_build.py, we know that there is one txt file and a bunch of json files. Could you release an example of both the txt and json files?
  2. Is the updated checkpoint trained with the currently released training code? If not, would you mind releasing the newest training code?
  3. If possible, could you add more training details to the README?

Training details

Hi,
I have some questions regarding the training details.

  1. Do you use EMA?
  2. How do you pick the best model, i.e., based on which metric?

How to use "Cross-Image region drag and merge"?

I am sorry if this is not the place to ask questions.

I launched the Gradio demo through editany.py, but I could not figure out how to select a green region and drag-merge another image, as shown in the Features section. I think it is really cool to drag and merge the outfit of Superman.

Would really appreciate it if the author or anybody else could help me out.

Convert your ControlNet model to A1111/lllyasviel format

I am the author of sd-webui-segment-anything. The only thing keeping your work from being compatible with the Mikubill ControlNet extension and my SAM extension is your ControlNet model format. Please convert your ControlNet model to lllyasviel format, similar to this (1 yaml + 1 model state dict; the state dict keys should be the same as the other models here, and the yaml filename should match the model filename except for the file extension), and your work can soon be compatible with A1111 sd-webui and accessible to the broad Stable Diffusion community.

Unable to reproduce the dog's head example when using the same example image

Hi there,

Thank you so much for your amazing work. I am trying the code with your example images, but when I run sam2groundingdino_edit.py on the white dog image with mask_prompt="dog head." and prompt="cute dog", exactly as in the demo posted in the README, the algorithm only segments and edits the entire dog. I tried several times and got the same segmentation results. What should I do if I only want to edit the dog's head? I didn't modify any piece of code.

Thank you so much for your help!

Screenshot from 2023-11-01 18-51-01
Screenshot from 2023-11-01 18-52-16

Edit high res?

Similar to high res inpainting in SD webui, would this be possible?

Multi-GPUs training/inference & Training of `StableDiffusionControlNetInpaintMixingPipeline`

Thanks for your great project!

May I know whether we can enable multi-GPU inference? And did you train your inpainting model on multiple GPUs?

Could you also share an example training file that can accommodate your class, StableDiffusionControlNetInpaintMixingPipeline?

Is the training pipeline similar to the one in the diffuser tutorial for training the controlnet https://huggingface.co/docs/diffusers/training/controlnet ?

Since I want to customize a model that can run StableDiffusionControlNetInpaintMixingPipeline on a specific dataset, it would be much appreciated if you could upload even a rough training file without modifying the code w.r.t. the LAION dataset, so that I can quickly get the main idea of the overall structure of your training pipeline; the dataset preprocessing should be much easier to debug on my end :)

Train on SD 1.5

Would it be possible to also release a controlnet model with SD 1.5 as the base model?

Model error: cannot find diffusion_pytorch_model.bin

Executing sam2edit.py and the two other files keeps returning this error:

OSError: shgao/edit-anything-v0-1-1 does not appear to have a file named diffusion_pytorch_model.bin

Specifically, the error says:

EntryNotFoundError: 404 Client Error. (Request ID: Root=1-64366123-5a1da902152616ed1d113444)

Entry Not Found for url: https://huggingface.co/shgao/edit-anything-v0-1-1/resolve/main/diffusion_pytorch_model.bin.

Downloading the .ckpt file from Hugging Face also doesn't seem to solve the error. Am I missing something here?

Discord?

Suggestion: a Discord server for EditAnything so we can have a community to plan and collaborate on projects?

Colors for SAM mask based ControlNet during training

Hi,

Thanks for the great repo. I was wondering how the label colors for the SAM-mask-based control model are decided. More specifically, are the label colors decided based on a closed vocabulary, or is it an open vocabulary with random colors?

Thanks!

FileNotFoundError

Hello,

I found this repo yesterday and thought I would give it a try, but after I finished the setup steps, running app.py gives me this error:

image

please help, thanks

Train ControlNet - prompt for each segmented mask region?

Hi again! : )

is it possible to train with a prompt for each segmented mask region?

Ex:
Input image: house

Segmentation mask:

  • Orange: roof
  • Blue: Walls
  • Pink: window

Prompts:

  • Orange: orange.txt ("a roof with solar panels")
  • Blue: blue.txt ("a wall made of concrete")
  • Pink: pink.txt ("a blue wooden window")

That would open up a lot of possibilities!

Why is the mask generated again?

On the first click, we get a mask in green, but when checking the details of the code, I found that it generates a mask again. What is the reason?
edit_lora.py, line 775:

            full_segmask, detected_map = self.get_sam_control(
                resize_image(input_image, detect_resolution)
            )

Using Dreambooth Model that wasn't trained with inpainting

model = EditAnythingLoraModel(base_model_path="nitrosocke/Ghibli-Diffusion",
                                  controlmodel_name='LAION Pretrained(v0-4)-SD15', extra_inpaint=True,
                                  lora_model_path=None, use_blip=True)

Hello, when running the above code, I am getting the following error:

Traceback (most recent call last):
  File "/home/amankishore/.local/lib/python3.8/site-packages/gradio/routes.py", line 337, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/amankishore/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1015, in process_api
    result = await self.call_function(
  File "/home/amankishore/.local/lib/python3.8/site-packages/gradio/blocks.py", line 833, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/amankishore/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/amankishore/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/amankishore/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/amankishore/mirage/deep-learning/EditAnything/sam2edit_lora.py", line 446, in process
    x_samples = self.pipe(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/amankishore/mirage/deep-learning/EditAnything/utils/stable_diffusion_controlnet_inpaint.py", line 1086, in __call__
    down_block_res_samples, mid_block_res_sample = self.controlnet(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_controlnet.py", line 125, in forward
    down_samples, mid_sample = controlnet(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/controlnet.py", line 526, in forward
    sample, res_samples = downsample_block(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/unet_2d_blocks.py", line 867, in forward
    hidden_states = attn(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/attention.py", line 331, in forward
    attn_output = self.attn2(
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 267, in forward
    return self.processor(
  File "/home/amankishore/.local/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 689, in __call__
    key = attn.to_k(encoder_hidden_states)
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/amankishore/.local/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (240x768 and 1024x320)

Encode Mask for ControlNet

Hi,
I'm wondering how the masks are encoded as the input to ControlNet. If I understood correctly, the masks predicted by SAM are binary masks without class information, so did you randomly sample colors for each mask?

Thanks a lot in advance!

Question: can I apply a partial segmentation control image?

Summary

I would like to achieve some masked inpainting, but I want to limit the segmentation to only a part of the image and leave the regions outside the segmentation mask unconstrained.

Motivation

The motivation comes from the pains of regular inpainting: if I only supply a mask and a prompt without further constraints, the model will usually add unwanted features to the subject. See the example below, where the inpainted image grows extra legs on the dog.

Prompt: "dog on the beach, best quality"

(Attached images: the original dog image from the segment-anything example, the inpainting mask, and the image inpainted with controlnet-inpaint.)

To prevent this, I am thinking of using a segmentation-mask ControlNet to constrain the extent of the primary subject. I use the following script to assign the masked region to segment 1 and the other region to segment 0, but it results in the model treating everything else uniformly. See the produced image.

        def mask_to_control(mask):
            # mask is a binary mask after HWC3(...)
            res = np.zeros((mask.shape[0], mask.shape[1], 3))
            res[:, :, 0] = np.where(mask[:, :, 0] >= 128, 0, 1)
            return res

881af9747e4b4e6f81ee59032770e35e

Is it possible for the image not to be bounded by the segmentation mask outside of the masked region?
I am still new to ControlNet, so please don't hesitate to point out any mistakes. Glad to hear from all of you!

App.py run error

Hi, I am using VS Code to run the files, and this is the error I got. I have a conda environment in VS Code and pointed my project at the "control" conda environment.

PS C:\Users\Furkan\Desktop\3d\Diffusion\editanything\EditAnything> & C:/Users/Furkan/anaconda3/envs/control/python.exe c:/Users/Furkan/Desktop/3d/Diffusion/editanything/EditAnything/app.py
C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\__init__.py:138: UserWarning: mkl-service package failed to import, therefore Intel(R) MKL initialization ensuring its correct out-of-the box operation under condition when Gnu OpenMP had already been loaded by Python process is not assured. Please install mkl-service package, see http://github.com/IntelPython/mkl-service
from . import _distributor_init
Traceback (most recent call last):
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\core\__init__.py", line 23, in <module>
from . import multiarray
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\core\multiarray.py", line 10, in <module>
from . import overrides
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\core\overrides.py", line 6, in <module>
from numpy.core._multiarray_umath import (
ImportError: DLL load failed while importing _multiarray_umath: Path could not be found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:/Users/Furkan/Desktop/3d/Diffusion/editanything/EditAnything/app.py", line 1, in <module>
import gradio as gr
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\gradio\__init__.py", line 3, in <module>
import gradio.components as components
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\gradio\components\__init__.py", line 1, in <module>
from gradio.components.annotated_image import AnnotatedImage
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\gradio\components\annotated_image.py", line 7, in <module>
import numpy as np
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\__init__.py", line 140, in <module>
from . import core
File "C:\Users\Furkan\anaconda3\envs\control\lib\site-packages\numpy\core\__init__.py", line 49, in <module>
raise ImportError(msg)
ImportError:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  • The Python version is: Python3.8 from "C:\Users\Furkan\anaconda3\envs\control\python.exe"
  • The NumPy version is: "1.23.1"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: DLL load failed while importing _multiarray_umath: Module cannot be found

When I run it from cmd.exe I get the error below, and it still appears even after I install gradio in the conda environment.

File "C:\Users\Furkan\Desktop\3d\Diffusion\editanything\EditAnything\app.py", line 1, in <module> import gradio as gr ModuleNotFoundError: No module named 'gradio'

Train ControlNet - custom resolution?

Hi, thanks for sharing this amazing repo! I was thinking of doing something similar; super impressive that you released this just a couple of days after SAM! 🚀

My dataset is 1024 x 768. Is it possible to train with a custom resolution?

Thanks!
