
Comments (12)

lllyasviel commented on July 28, 2024

If we just consider these examples, it seems the best solution is to use scripts to progressively upscale the image with tile until each window in those buildings has a 512x512 resolution. I estimated it, and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; we can always use "beautiful city with buildings, 4k, 8k, balabalabala".
Generating a perfect image may take many hours on a 4090.
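
As a rough illustration of the estimate above, the arithmetic can be sketched in Python. The base size (1024x768) and the 10-pixel window are assumptions that do not appear in the thread; they were picked only because they land within a pixel of the quoted figure. The function names are hypothetical.

```python
def required_resolution(base_w, base_h, window_px, target_px=512):
    """Size the image must reach so a `window_px` feature spans `target_px`."""
    # Integer ceiling division keeps the result free of float rounding.
    new_w = -(-base_w * target_px // window_px)
    new_h = -(-base_h * target_px // window_px)
    return new_w, new_h


def doubling_passes(window_px, target_px=512):
    """Count the 2x upscale passes needed for the feature to reach target_px."""
    passes = 0
    size = window_px
    while size < target_px:
        size *= 2
        passes += 1
    return passes


# Hypothetical inputs: a 1024x768 render whose windows are about 10 px wide.
width, height = required_resolution(1024, 768, window_px=10)
print(width, height, doubling_passes(10))
```

With these assumed inputs the enlargement factor is 51.2x, which is why a progressive script would need several doubling passes rather than one upscale.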

from controlnet-v1-1-nightly.

lllyasviel commented on July 28, 2024

Just Use Automatic 1111

The results below all use default parameters and the same simple prompts shown in my screenshot. A1111 is just magic.
image
image
image


lllyasviel commented on July 28, 2024

Edit: Frequently asked questions are edited and pinned to help more people.
Edit2: Closed since solution found. Edited title restored.


xarthurx commented on July 28, 2024

@lllyasviel
First, thank you very much for your time on this topic.

For the image you generated, I'd like to provide an architectural perspective:

As we're professionals, we evaluate the quality of the specific architecture seriously (geometry, space quality, etc.), and not based on the "general feeling" or the "style" of the image.

So if you look at the facade in the image, you'll see that the mullions and windows have strange shapes. We have run into this effect a lot and cannot overcome it completely by training DreamBooth or LoRA. That's why we're here: we would like to ask your advice on whether ControlNet can help.

image


lllyasviel commented on July 28, 2024

You can solve these to some extent using cnet 1.1 tile (v11f1e), but this is again an a1111-only feature and requires learning some a1111 knowledge.
image
(and you can try m**j*****y and compare which solution is better)
(and if you want to burn your GPU, you can try running this image through tile again. Tile is almost infinite for images with buildings like this, but it will really burn the GPU)


xarthurx commented on July 28, 2024

> u can somewhat solve these, to some extent, using cnet 1.1 tile (v11f1e) but this is again another a1111-only feature and requires learning some a1111 knowledges image (and you can try mj***y and compare which solution is better) (and if you want to burn ur gpu, u can try running this image in tile again. tile is almost infinite for images with buildings like this. but this will really burn the gpu)

Really helpful input!

  1. We turned from MJ to SD+ControlNet because we need to control the geometry more strictly in the later part of the design process, so MJ is not an option for non-conceptual design.
  2. Those results help to some extent (YES, we're indeed using a1111), but they do not fully resolve the problem (full resolution may require burning the GPU very hard). It seems my naive proposal of training a cnet did not strike you as a good idea. Theoretically, do you think there is a way to resolve the issue, even if it is not a quick or end-user solution?


lllyasviel commented on July 28, 2024

Unfortunately, it seems that at that resolution the webui's gradio HTML crashes before ControlNet fails. The good news is that ControlNet still works at that scale. The bad news is that your browser does not support it. Perhaps try Firefox.


xarthurx commented on July 28, 2024

> it seems if we just consider these examples, the best solution is to use scripts to progressively upscale it with tile, until each window in those buildings have a 512x512 resolution, I estimated it and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generate a perfect image may take many hours on a 4090

> unfortunaly, it seems at that resolution, webui's gradio HTML crashes before controlnet fail. Good news is that controlnet is still working at that scale. bad news is that your browser does not support it. perhaps try firefox

This is definitely a "theoretical" solution (though different from what I expected), but I now kind of understand how "tile" works, which I did not expect. 🤣

I guess that for practical use (we need ~2k resolution in < 5 min), this is still an "unresolved" problem...
I originally, and incorrectly, assumed this could be fixed by a special type of cnet. It seems I need to wait for a more "vector-based" style of plugin to control such things...

But anyway, thank you for your time and input. Really appreciate it.


xarthurx commented on July 28, 2024

> it seems if we just consider these examples, the best solution is to use scripts to progressively upscale it with tile, until each window in those buildings have a 512x512 resolution, I estimated it and the resolution needed to solve this image is about 52,428*39,322. We do not need to change the prompt; can always use "beautiful city with buildings, 4k, 8k, balabalabala". Generate a perfect image may take many hours on a 4090

It occurred to me just after posting the above: could we use a region-based script to "upscale and then downscale" only the area of the facade?

  • select the area that needs to be fixed
  • upscale with "tile" until the result is satisfactory
  • downscale to the initial size

This would save GPU time and could probably save the browser, too?
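
To put a number on the GPU saving this proposal is after, here is a minimal sketch that compares the pixels processed when upscaling the whole image with the pixels processed when upscaling only the selected region. The image size, region box, and scale factor are hypothetical, and `pixel_budget` is an illustrative helper, not part of a1111 or ControlNet.

```python
def pixel_budget(full_size, region_box, scale):
    """Return (full_pixels, region_pixels) after upscaling by an integer scale.

    full_size  -- (width, height) of the original image
    region_box -- (left, top, right, bottom) of the area to fix
    """
    full_w, full_h = full_size
    left, top, right, bottom = region_box
    region_w, region_h = right - left, bottom - top
    full_pixels = (full_w * scale) * (full_h * scale)
    region_pixels = (region_w * scale) * (region_h * scale)
    return full_pixels, region_pixels


# Hypothetical case: a 512x256 facade area inside a 1024x768 render, 8x upscale.
full_px, region_px = pixel_budget((1024, 768), (256, 128, 768, 384), scale=8)
print(full_px, region_px, full_px / region_px)
```

The saving is simply the area ratio between the full image and the region, so it is independent of the scale factor. It says nothing about whether the detail survives the final downscale step.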


lllyasviel commented on July 28, 2024

LDMs learn specific patterns at specific conv layer levels. If you want the learned pattern to draw something like a window on a wall, you need to give that thing a 512x512 space to occupy, so that the patterns learned in the corresponding conv layers can be triggered. So you cannot downscale it, unfortunately.
But perhaps you can try slicing tiles only along the MLSD lines to save computation.
But since we have already begun to burn the GPU, perhaps just burn it without unnecessary mercy.
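
The idea of slicing tiles only along the MLSD lines could look roughly like the sketch below. It assumes the detected line segments are already available as `(x0, y0, x1, y1)` pixel coordinates inside the image; that format, the tile size, and the function name are all assumptions for illustration, not the output format of any specific MLSD implementation. Tiles are found by sampling points along each segment about one pixel apart, which is an approximation rather than an exact intersection test.

```python
def tiles_touching_lines(width, height, tile, segments):
    """Return the set of (col, row) tile indices crossed by any segment."""
    cols = -(-width // tile)   # ceiling division
    rows = -(-height // tile)
    selected = set()
    for x0, y0, x1, y1 in segments:
        # Roughly one sample per pixel along the longer axis.
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for i in range(steps + 1):
            t = i / steps
            x = x0 + (x1 - x0) * t
            y = y0 + (y1 - y0) * t
            # Clamp so points on the image border stay inside the grid.
            col = max(0, min(int(x) // tile, cols - 1))
            row = max(0, min(int(y) // tile, rows - 1))
            selected.add((col, row))
    return selected


# Hypothetical input: one horizontal facade line across a 2048x1024 image.
tiles = tiles_touching_lines(2048, 1024, 512, [(0, 100, 2047, 100)])
print(sorted(tiles), "of", (2048 // 512) * (1024 // 512), "tiles")
```

Only the selected tiles would then be sent through the tile pass, leaving sky or other line-free areas untouched.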


xiaohaipeng commented on July 28, 2024

@lllyasviel

Oh god, this picture is perfect and has great detail. With the ControlNet tile model, how did you set the parameters, in detail?


daizhuo commented on July 28, 2024

> u can somewhat solve these, to some extent, using cnet 1.1 tile (v11f1e) but this is again another a1111-only feature and requires learning some a1111 knowledges image (and you can try mj***y and compare which solution is better) (and if you want to burn ur gpu, u can try running this image in tile again. tile is almost infinite for images with buildings like this. but this will really burn the gpu)

How did you make this?
Could you describe the process in detail?
This process is very important for architectural design.
I tried with no luck!
Thank you so much!

