All right, I tried to do this on my own, and came up with this dirty script:
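# clipcolors.py
import modules.scripts as scripts
from modules import shared
from modules.processing import process_images


class Script(scripts.Script):
    def title(self):
        return "clipcolors"

    def ui(self, is_img2img):
        return []

    def run(self, p):
        clip = shared.sd_model.cond_stage_model
        encoder = clip.wrapped.transformer.text_model.encoder
        pos = True
        h = encoder.forward  # original encoder forward, restored at the end

        def H(*ar, **kw):
            # Every other call is forwarded untouched (presumably the negative
            # prompt); only the second call of each pair gets the treatment.
            nonlocal pos
            if pos:
                pos = False
                return h(*ar, **kw)
            pos = True
            inputs_embeds = kw['inputs_embeds']
            E = inputs_embeds[0]
            a = 0     # start of the current group
            b = 0     # end of the current group (exclusive)
            c = None  # None: baseline pass, True: per-group pass, False: final pass

            def G(f):
                y = None  # this layer's output from the baseline pass
                z = None  # this layer's merged output across all groups

                def F(X, *ar, **kw):
                    nonlocal a, b, c, y, z
                    R = f(X, *ar, **kw)
                    r = R[0][0]
                    if c is None:
                        # Baseline pass: remember the output of this layer.
                        y = r.clone()
                        z = r.clone()
                    elif c:
                        # Per-group pass: restore everything outside the group,
                        # keep the group's own rows in the merged result.
                        r[:a, :] = y[:a, :]
                        r[b:, :] = y[b:, :]
                        z[a:b, :] = r[a:b, :]
                    else:
                        # Final pass: output the merged result.
                        r[:, :] = z[:, :]
                    return R
                return F

            # (target position, group start, group end), end exclusive
            arr = [
                (14, 14, 16),
                (17, 17, 19),
                (20, 20, 22),
                (23, 23, 25),
                (26, 26, 28),
                (29, 29, 31),
                (32, 32, 34),
                (35, 35, 37),
            ]
            e = E.clone()
            for P in arr:
                E[P[0], :] = 0.0
            layers = encoder.layers
            for i in range(len(layers)):
                f = layers[i].forward
                F = G(f)
                F._f_ = f  # keep the original so it can be restored
                layers[i].forward = F
            try:
                h(*ar, **kw)
                c = True
                for P in arr:
                    t = P[0]  # named t so it does not shadow run()'s p
                    E[t, :] = e[t, :]
                    a = P[1]
                    b = P[2]
                    h(*ar, **kw)
                    E[t, :] = 0.0
                c = False
                r = h(*ar, **kw)
            finally:
                for i in range(len(layers)):
                    layers[i].forward = layers[i].forward._f_
            return r

        encoder.forward = H
        try:
            proc = process_images(p)
        finally:
            encoder.forward = h
        return proc
#EOF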
(I haven't tested it well: it might leak memory or leave the model broken, so it is better to always restart WebUI just to be sure that nothing is left over from previous runs.)
Actual token positions are not exposed in the UI yet; I set them as a constant array in the code, tuned for this exact prompt:
full-body photo, beautiful girl is sitting on the floor, red eyes, green shirt, yellow skirt, blue shoes, white hair, black background, orange gloves, purple light, best quality, masterpiece
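The constant positions could be derived instead of counted by hand. Here is a minimal sketch, assuming every word and punctuation mark maps to exactly one CLIP token. That is an approximation: rare words are split into several BPE pieces, so a real version should ask the actual tokenizer. `find_groups` and its parameters are hypothetical names, not part of WebUI or Cutoff. Under that assumption it reproduces the array from the script:

```python
import re

PROMPT = ("full-body photo, beautiful girl is sitting on the floor, red eyes, "
          "green shirt, yellow skirt, blue shoes, white hair, black background, "
          "orange gloves, purple light, best quality, masterpiece")
COLORS = ["red", "green", "yellow", "blue", "white", "black", "orange", "purple"]


def find_groups(prompt, targets, width=2):
    """Return (target, start, end) triples in the format used by `arr`."""
    # Rough stand-in for CLIP's pre-tokenizer: runs of letters, single digits,
    # or runs of other non-space characters. ASSUMPTION: one piece == one token.
    pieces = re.findall(r"[a-z]+|\d|[^\sa-z\d]+", prompt.lower())
    groups = []
    for i, piece in enumerate(pieces):
        if piece in targets:
            pos = i + 1  # position 0 is the start-of-text token
            groups.append((pos, pos, pos + width))
    return groups


arr = find_groups(PROMPT, COLORS)
```

`width=2` covers the color and the token right after it, which is the grouping the script currently uses.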
The algorithm is:
1. Hook forward() of CLIP and all of its layers.
2. On forward call:
3. Replace targets with zero-vectors (keeping original clones).
4. CLIP forward, but store results after each layer.
5. For each target token group (the color and some of its next/previous tokens; currently I'm doing just the next one):
5.1. Restore target token.
5.2. CLIP forward, but replace results for each layer: restore all vectors (by their saved versions) except for the current group; keep the current group result separately.
5.3. Replace target back with zero, so the next group would be independent.
6. CLIP forward once again, this time ignoring all layers, replacing them with merged results from all groups.
7. Unhook CLIP and return the result.
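The steps above can be sketched without WebUI or PyTorch. This is a toy model under stated assumptions: `toy_layer` is a made-up stand-in for a transformer layer (each row is blended with the mean of all rows, so every position "sees" every other), rows are plain Python lists, and `isolated_forward` is a hypothetical name. It shows the data flow only and does not reproduce the script's numerics.

```python
def toy_layer(rows, mix=0.5):
    """Made-up layer: blend every row with the mean row (crude 'attention')."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[d] for r in rows) / n for d in range(dim)]
    return [[(1 - mix) * r[d] + mix * mean[d] for d in range(dim)] for r in rows]


def isolated_forward(embeds, groups, layers):
    """groups: (target, start, end) triples; [start, end) is the token group."""
    # Zero every target embedding, keeping the originals in `embeds`.
    zeroed = [row[:] for row in embeds]
    for target, _, _ in groups:
        zeroed[target] = [0.0] * len(zeroed[target])

    # Baseline pass: remember the output of every layer.
    baseline, h = [], zeroed
    for layer in layers:
        h = layer(h)
        baseline.append(h)
    merged = [list(out) for out in baseline]

    for target, start, end in groups:
        h = list(zeroed)
        h[target] = embeds[target]  # restore only this group's target
        for i, layer in enumerate(layers):
            out = layer(h)
            # Outside the group, fall back to the baseline rows.
            h = [out[j] if start <= j < end else baseline[i][j]
                 for j in range(len(out))]
            # Inside the group, keep the result for the final merge.
            merged[i][start:end] = out[start:end]
        # `zeroed` is never modified, so the next group starts clean.

    # The final pass of the script just emits the merged last-layer output.
    return merged[-1]


layers = [toy_layer, toy_layer]
embeds = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 1.0], [2.0, 5.0]]
groups = [(1, 1, 3), (3, 3, 5)]
out = isolated_forward(embeds, groups, layers)
```

In this toy, changing the embedding of the second target (row 3) leaves rows 0 to 2 of the output untouched, which is the independence between groups that the method is after.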
I am not happy with its effect! It is actually as good and as bad as your own Cutoff with weight=1 and "Cutoff strongly".
No clear additional benefits…
For example, this is my test (model: suzumehachi, seed: 2277229613, negative: cropped, out-of-frame, bad anatomy, worst quality):
Original:
Without restoring target tokens:
My main result with restored tokens:
Your Cutoff with default settings gives this when targeted at
red, green, yellow, blue, white, black, orange, purple
With Cutoff strong and Weight 1, it gives:
And this one is for weight = 2:
For me, it is more or less the same thing. My method doesn't add anything valuable for preventing color shifts.
But now I have another idea!
- Call U-Net, either on final cutoff result, or with zeroed tokens (whichever would be better).
- Grab cross-attention maps for each object that we wanted to bind color to ("eyes", "shirt", "skirt", "shoes", "hair", "background", "gloves", "light")
- Copy those maps to color tokens accordingly.
- Call U-Net with adjusted cross-attention maps. (Or do this on the same step; I don't know how such attention patching actually works.)
Will this help U-Net not to shift colors? This way, not only will CLIP process "red" without knowing anything about "green" or other colors, but U-Net will also attend to "red" in the same regions where it attends to "eyes", and not to "shirt" or anything else.
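Here is what "copying maps" could mean, on toy data. It is only a sketch under assumptions: the attention map is a plain list of rows (one per image position) with one weight per prompt token, `bind_attention` is a hypothetical helper, and renormalizing each row afterwards is an assumption about how to keep the weights summing to one. Hooking the real U-Net cross-attention is not shown.

```python
def bind_attention(attn, pairs, renormalize=True):
    """Copy each object token's attention column onto its color token.

    attn:  list of rows, one per image position; row[j] is the weight on token j.
    pairs: (color_token, object_token) index pairs.
    """
    patched = []
    for row in attn:
        new_row = list(row)
        for color, obj in pairs:
            new_row[color] = row[obj]  # color now looks where the object looks
        if renormalize:
            total = sum(new_row)
            if total > 0:
                new_row = [w / total for w in new_row]
        patched.append(new_row)
    return patched


# Tokens: 0 = "red", 1 = "eyes", 2 = "green", 3 = "shirt"
attn = [
    [0.1, 0.7, 0.1, 0.1],  # a position in the eye region
    [0.3, 0.1, 0.1, 0.5],  # a position in the shirt region
]
patched = bind_attention(attn, [(0, 1), (2, 3)], renormalize=False)
```

After patching, "red" carries the same weight as "eyes" at every position and "green" the same as "shirt", so a color can no longer be strong where its object is weak.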
from sd-webui-cutoff.