google / prompt-to-prompt
License: Apache License 2.0
I'm running with diffusers 0.14.0.
The "NoneType" call error comes from the attention function in the forward pass of CrossAttnDownBlock2D.
The up-to-date diffusers 0.11.1 has a bug with the attention mask.
By the way, can you show how to train the null-text inversion on a 16 GB GPU such as a V100? It also reports that there is not enough CUDA memory...
Thanks for your research :)
I get an error:
162 # set timesteps
163 extra_set_kwargs = {"offset": 1}
--> 164 model.scheduler.set_timesteps(num_inference_steps, **extra_set_kwargs)
165 for t in tqdm(model.scheduler.timesteps):
166 latents = diffusion_step(model, controller, latents, context, t, guidance_scale, low_resource)
TypeError: set_timesteps() got an unexpected keyword argument 'offset'
Could you please help me solve it? Thank you in advance.
I am trying to run the Jupyter notebooks to test the code, but I get this error on the line where the prompts are passed to the model. The code cell and the error are as follows:
g_cpu = torch.Generator().manual_seed(8888)
prompts = ["A painting of a squirrel eating a burger"]
controller = AttentionStore()
image, x_t = run_and_display(prompts, controller, latent=None, run_baseline=False, generator=g_cpu)
show_cross_attention(controller, res=16, from_where=("up", "down"))
TypeError Traceback (most recent call last)
Cell In[9], line 4
2 prompts = ["A painting of a squirrel eating a burger"]
3 controller = AttentionStore()
----> 4 image, x_t = run_and_display(prompts, controller, latent=None, run_baseline=False, generator=g_cpu)
5 show_cross_attention(controller, res=16, from_where=("up", "down"))
Cell In[6], line 6, in run_and_display(prompts, controller, latent, run_baseline, generator)
4 images, latent = run_and_display(prompts, EmptyControl(), latent=latent, run_baseline=False, generator=generator)
5 print("with prompt-to-prompt")
----> 6 images, x_t = ptp_utils.text2image_ldm_stable(ldm_stable, prompts, controller, latent=latent, num_inference_steps=NUM_DIFFUSION_STEPS, guidance_scale=GUIDANCE_SCALE, generator=generator, low_resource=LOW_RESOURCE)
7 ptp_utils.view_images(images)
8 return images, x_t
File ~/.conda/envs/prompt/lib/python3.8/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
24 @functools.wraps(func)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
File ~/Downloads/prompt-to-prompt/ptp_utils.py:167, in text2image_ldm_stable(model, prompt, controller, num_inference_steps, guidance_scale, generator, latent, low_resource)
165 model.scheduler.set_timesteps(num_inference_steps)
166 for t in tqdm(model.scheduler.timesteps):
--> 167 latents = diffusion_step(model, controller, latents, context, t, guidance_scale, low_resource)
...
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
TypeError: forward() got an unexpected keyword argument 'encoder_hidden_states'
Any suggestions?
When I run this code, an error occurs in the latent2image function:
image = vae.decode(latents)["samples"]
too many indices for tensor of dimension 4
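One possible reading of this error is that, in the installed diffusers version, decode returned a plain 4-D tensor rather than a dict-like output, so string indexing fails. That is only a guess. The sketch below is a hypothetical helper (unwrap_sample is not part of ptp_utils or diffusers) that accepts a dict-like output, an object with a sample attribute, or a bare tensor:

```python
from types import SimpleNamespace


def unwrap_sample(output, key="sample"):
    """Return the decoded tensor regardless of how the decoder wrapped it.

    Hypothetical helper: handles a dict-like output, an object exposing
    the value as an attribute, or a bare tensor returned directly.
    """
    if isinstance(output, dict):
        return output[key]
    if hasattr(output, key):
        return getattr(output, key)
    return output


# Stand-in values; a real call would pass the result of vae.decode(latents).
fake_tensor = [[0.1, 0.2]]
print(unwrap_sample({"sample": fake_tensor}) is fake_tensor)
print(unwrap_sample(SimpleNamespace(sample=fake_tensor)) is fake_tensor)
print(unwrap_sample(fake_tensor) is fake_tensor)
```

In latent2image this would be used as image = unwrap_sample(vae.decode(latents)), assuming the key is "sample" in your version.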
DDIM inversion...
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3505, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_585007/266642345.py", line 3, in
(image_gt, image_enc), x_t, uncond_embeddings = null_inversion.invert(image_path, prompt, offsets=(0,0,200,0), verbose=True)
File "/tmp/ipykernel_585007/262494972.py", line 168, in invert
image_rec, ddim_latents = self.ddim_inversion(image_gt)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/tmp/ipykernel_585007/262494972.py", line 125, in ddim_inversion
ddim_latents = self.ddim_loop(latent)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/tmp/ipykernel_585007/262494972.py", line 112, in ddim_loop
noise_pred = self.get_noise_pred_single(latent, t, cond_embeddings)
File "/tmp/ipykernel_585007/262494972.py", line 46, in get_noise_pred_single
noise_pred = self.model.unet(latents, t, encoder_hidden_states=context)["sample"]
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/diffusers/models/unet_2d_condition.py", line 582, in forward
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/diffusers/models/unet_2d_blocks.py", line 837, in forward
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/diffusers/models/transformer_2d.py", line 265, in forward
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/diffusers/models/attention.py", line 291, in forward
class FeedForward(nn.Module):
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'encoder_hidden_states'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 2102, in showtraceback
stb = self.InteractiveTB.structured_traceback(
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1310, in structured_traceback
return FormattedTB.structured_traceback(
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1199, in structured_traceback
return VerboseTB.structured_traceback(
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 1052, in structured_traceback
formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 978, in format_exception_as_a_whole
frames.append(self.format_record(record))
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 878, in format_record
frame_info.lines, Colors, self.has_colors, lvals
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/IPython/core/ultratb.py", line 712, in lines
return self._sd.lines
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.dict[self.func.name] = self.func(obj)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/core.py", line 698, in lines
pieces = self.included_pieces
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.dict[self.func.name] = self.func(obj)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/core.py", line 649, in included_pieces
pos = scope_pieces.index(self.executing_piece)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.dict[self.func.name] = self.func(obj)
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/stack_data/core.py", line 628, in executing_piece
return only(
File "/home/nras/miniconda3/envs/py3.8/lib/python3.8/site-packages/executing/executing.py", line 164, in only
raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0
@google-admin, thanks for your great work and for sharing it. But when I click the link "https://github.com/google/prompt-to-prompt/blob/main/%22./prompt-to-prompt_stable.ipynb%22", I get a 404 error.
I have installed the required diffusers and transformers, but the following error occurs:
TypeError Traceback (most recent call last)
in
1 from typing import Optional, Union, Tuple, List, Callable, Dict
2 import torch
----> 3 from diffusers import StableDiffusionPipeline
4 import torch.nn.functional as nnf
5 import numpy as np
~/anaconda3/lib/python3.8/site-packages/diffusers/init.py in
24 )
25 from .pipeline_utils import DiffusionPipeline
---> 26 from .pipelines import DDIMPipeline, DDPMPipeline, KarrasVePipeline, LDMPipeline, PNDMPipeline, ScoreSdeVePipeline
27 from .schedulers import (
28 DDIMScheduler,
~/anaconda3/lib/python3.8/site-packages/diffusers/pipelines/init.py in
9
10 if is_transformers_available():
---> 11 from .latent_diffusion import LDMTextToImagePipeline
12 from .stable_diffusion import (
13 StableDiffusionImg2ImgPipeline,
~/anaconda3/lib/python3.8/site-packages/diffusers/pipelines/latent_diffusion/init.py in
4
5 if is_transformers_available():
----> 6 from .pipeline_latent_diffusion import LDMBertModel, LDMTextToImagePipeline
~/anaconda3/lib/python3.8/site-packages/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py in
7 import torch.utils.checkpoint
8
----> 9 from transformers.activations import ACT2FN
10 from transformers.configuration_utils import PretrainedConfig
11 from transformers.modeling_outputs import BaseModelOutput
~/anaconda3/lib/python3.8/site-packages/transformers/init.py in
28
29 # Check the dependencies satisfy the minimal versions required.
---> 30 from . import dependency_versions_check
31 from .utils import (
32 OptionalDependencyNotAvailable,
~/anaconda3/lib/python3.8/site-packages/transformers/dependency_versions_check.py in
15
16 from .dependency_versions_table import deps
---> 17 from .utils.versions import require_version, require_version_core
18
19
~/anaconda3/lib/python3.8/site-packages/transformers/utils/init.py in
32 replace_return_docstrings,
33 )
---> 34 from .generic import (
35 ContextManagers,
36 ExplicitEnum,
~/anaconda3/lib/python3.8/site-packages/transformers/utils/generic.py in
31
32 if is_tf_available():
---> 33 import tensorflow as tf
34
35 if is_flax_available():
~/anaconda3/lib/python3.8/site-packages/tensorflow/init.py in
53 from ._api.v2 import autograph
54 from ._api.v2 import bitwise
---> 55 from ._api.v2 import compat
56 from ._api.v2 import config
57 from ._api.v2 import data
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/init.py in
37 import sys as _sys
38
---> 39 from . import v1
40 from . import v2
41 from tensorflow.python.compat.compat import forward_compatibility_horizon
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/init.py in
32 from . import autograph
33 from . import bitwise
---> 34 from . import compat
35 from . import config
36 from . import data
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/compat/init.py in
37 import sys as _sys
38
---> 39 from . import v1
40 from . import v2
41 from tensorflow.python.compat.compat import forward_compatibility_horizon
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/compat/v1/init.py in
49 from tensorflow._api.v2.compat.v1 import layers
50 from tensorflow._api.v2.compat.v1 import linalg
---> 51 from tensorflow._api.v2.compat.v1 import lite
52 from tensorflow._api.v2.compat.v1 import logging
53 from tensorflow._api.v2.compat.v1 import lookup
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/lite/init.py in
9
10 from . import constants
---> 11 from . import experimental
12 from tensorflow.lite.python.lite import Interpreter
13 from tensorflow.lite.python.lite import OpHint
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/lite/experimental/init.py in
8 import sys as _sys
9
---> 10 from . import authoring
11 from tensorflow.lite.python.analyzer import ModelAnalyzer as Analyzer
12 from tensorflow.lite.python.lite import OpResolverType
~/anaconda3/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/lite/experimental/authoring/init.py in
8 import sys as _sys
9
---> 10 from tensorflow.lite.python.authoring.authoring import compatible
11
12 del _print_function
~/anaconda3/lib/python3.8/site-packages/tensorflow/lite/python/authoring/authoring.py in
41
42 # pylint: disable=g-import-not-at-top
---> 43 from tensorflow.lite.python import convert
44 from tensorflow.lite.python import lite
45 from tensorflow.lite.python.metrics_wrapper import converter_error_data_pb2
~/anaconda3/lib/python3.8/site-packages/tensorflow/lite/python/convert.py in
31
32 from tensorflow.lite.python import lite_constants
---> 33 from tensorflow.lite.python import util
34 from tensorflow.lite.python import wrap_toco
35 from tensorflow.lite.python.convert_phase import Component
~/anaconda3/lib/python3.8/site-packages/tensorflow/lite/python/util.py in
53 # pylint: disable=unused-import
54 try:
---> 55 from jax import xla_computation as _xla_computation
56 except ImportError:
57 _xla_computation = None
~/anaconda3/lib/python3.8/site-packages/jax/init.py in
90 # These submodules are separate because they are in an import cycle with
91 # jax and rely on the names imported above.
---> 92 from . import image
93 from . import lax
94 from . import nn
~/anaconda3/lib/python3.8/site-packages/jax/image/init.py in
16
17 # flake8: noqa: F401
---> 18 from jax._src.image.scale import (
19 resize,
20 ResizeMethod,
~/anaconda3/lib/python3.8/site-packages/jax/_src/image/scale.py in
18
19 from jax import jit
---> 20 from jax import lax
21 from jax import numpy as jnp
22 import numpy as np
~/anaconda3/lib/python3.8/site-packages/jax/lax/init.py in
322 while_p,
323 )
--> 324 from jax._src.lax.fft import (
325 fft,
326 fft_p,
~/anaconda3/lib/python3.8/site-packages/jax/_src/lax/fft.py in
85
86 @partial(jit, static_argnums=1)
---> 87 def _rfft_transpose(t, fft_lengths):
88 # The transpose of RFFT can't be expressed only in terms of irfft. Instead of
89 # manually building up larger twiddle matrices (which would increase the
~/anaconda3/lib/python3.8/site-packages/jax/api.py in jit(fun, static_argnums, device, backend, donate_argnums)
179 """
180 if FLAGS.experimental_cpp_jit and config.omnistaging_enabled:
--> 181 return _cpp_jit(fun, static_argnums, device, backend, donate_argnums)
182 else:
183 return _python_jit(fun, static_argnums, device, backend, donate_argnums)
~/anaconda3/lib/python3.8/site-packages/jax/api.py in cpp_jit(fun, static_argnums, device, backend, donate_argnums)
365
366 static_argnums = (0,) + tuple(i + 1 for i in static_argnums)
--> 367 cpp_jitted_f = jax_jit.jit(fun, cache_miss, get_device_info,
368 get_jax_enable_x64, get_jax_disable_jit_flag,
369 static_argnums_)
TypeError: jit(): incompatible function arguments. The following argument types are supported:
1. (fun: function, cache_miss: function, get_device: function, static_argnums: List[int], static_argnames: List[str] = [], donate_argnums: List[int] = [], cache: jaxlib.xla_extension.CompiledFunctionCache = None) -> object
Invoked with: <function _rfft_transpose at 0x7f44d1e18ee0>, <function _cpp_jit.<locals>.cache_miss at 0x7f44d1e18f70>, <function _cpp_jit.<locals>.get_device_info at 0x7f44d1e1e040>, <function _cpp_jit.<locals>.get_jax_enable_x64 at 0x7f44d1e1e0d0>, <function _cpp_jit.<locals>.get_jax_disable_jit_flag at 0x7f44d1e1e160>, (0, 2)
What should I do to fix it?
Hi, thanks for your work!
When I tried some real images, the null-text inversion output was fine, but the ptp editing output was totally different from the input, for both the source image and the edited image. Could you please explain why this happens, and do you have any advice on how to solve it?
The requirements.txt states diffusers==0.3.0.
However, we run into the following error in Colab.
Also, 0.10.0 does not work.
Here is my attempt:
https://colab.research.google.com/drive/1fKzfvdv_7lf8bGpYiPq0AdJ7kUjjKMq6?usp=sharing
Related to #31
Hi, I just want to confirm whether I'm correct in understanding that the LocalBlend object used in the "Local Edit" section of the prompt-to-prompt_stable notebook implements the technique described on page 10 of the paper (i.e. the mask-based real image editing technique that does not rely on deterministic image inversion).
I am testing the inversion accuracy on the COCO dataset, but the results are not stable. The only change I made is relaxing the hard-coded 512x512 image size. Do you see any potential risks with that size change? Thanks
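One thing that may be worth checking when the size changes is which attention-map resolutions the new image produces, since the notebooks call show_cross_attention with res=16. The sketch below is hypothetical (latent_and_attention_sizes is not part of the repository) and assumes a VAE downsampling factor of 8 and four UNet resolution levels, which is my understanding of Stable Diffusion v1 rather than something stated in this thread:

```python
def latent_and_attention_sizes(height, width, vae_factor=8, unet_levels=4):
    """Return the latent size and the per-level feature-map sizes.

    Assumptions (not taken from the repository): the VAE shrinks each
    side by vae_factor, and the UNet halves the resolution between
    each of its unet_levels levels.
    """
    required = vae_factor * 2 ** (unet_levels - 1)
    if height % required or width % required:
        raise ValueError(f"height and width should be multiples of {required}")
    latent = (height // vae_factor, width // vae_factor)
    levels = [(latent[0] // 2 ** i, latent[1] // 2 ** i) for i in range(unet_levels)]
    return latent, levels


print(latent_and_attention_sizes(512, 512))
print(latent_and_attention_sizes(640, 448))
```

Under these assumptions a 512x512 input has a 16x16 level, whereas a non-square input such as 640x448 has no square 16x16 map, so code that reshapes attention to res x res would need adjusting.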
Hi, awesome paper!
Is it possible to integrate the cross-attention control mechanism into the memory-efficient attention formula?
From what I understand, cross-attention control modifies the attention map to make edits, but memory-efficient attention doesn't compute attention in the same way and doesn't explicitly compute the attention map. How can we tweak the memory-efficient attention formula to support cross-attention control? Is it possible to use both together?
Thank you!
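To make the tension concrete, here is a minimal sketch of standard (not memory-efficient) attention in plain Python, with a hook that edits the attention map before it is applied to the values. The names attention_with_edit and edit_fn are hypothetical and this is not the repository's implementation; the point is only that the edit needs the full map in memory, which is the thing fused memory-efficient kernels avoid building.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_edit(q, k, v, edit_fn=None):
    """Compute softmax(q k^T / sqrt(d)) v, optionally editing the map.

    q, k, v are lists of row vectors. edit_fn, if given, receives the
    full attention map and returns a replacement map of the same shape.
    """
    d = len(q[0])
    scores = [
        [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d) for k_row in k]
        for q_row in q
    ]
    attn = [softmax(row) for row in scores]
    if edit_fn is not None:
        attn = edit_fn(attn)  # this step requires the materialized map
    out = [
        [sum(w * v_row[j] for w, v_row in zip(row, v)) for j in range(len(v[0]))]
        for row in attn
    ]
    return out, attn


q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0]]
plain_out, plain_map = attention_with_edit(q, k, v)
edited_out, _ = attention_with_edit(q, k, v, edit_fn=lambda m: [[0.0, 1.0]])
print(plain_map, edited_out)
```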
Compared to Imagic, for example:
This is the null-text method:
prompts = ["A dog",
"A sitting dog"
]
This is Imagic:
The complete inference code is as follows:
image_path = "./imgs/dog2.png"
prompt = "A dog"
# offsets=(0,0,200,0)
(image_gt, image_enc), x_t, uncond_embeddings = null_inversion.invert(image_path, prompt, verbose=True)
print("Modify or remove offsets according to your image!")
prompts = [prompt]
controller = AttentionStore()
image_inv, x_t = run_and_display(prompts, controller, run_baseline=False, latent=x_t, uncond_embeddings=uncond_embeddings, verbose=False)
print("showing from left to right: the ground truth image, the vq-autoencoder reconstruction, the null-text inverted image")
ptp_utils.view_images([image_gt, image_enc, image_inv[0]])
show_cross_attention(controller, 16, ["up", "down"])
prompts = ["A dog",
"A sitting dog"
]
cross_replace_steps = {'default_': .8, }
self_replace_steps = .7
blend_word = ((('dog',),)) # for local edit
eq_params = {"words": ("sitting", ), "values": (5,)}
controller = make_controller(prompts, False, cross_replace_steps, self_replace_steps, blend_word, eq_params)
images, _ = run_and_display(prompts, controller, run_baseline=False, latent=x_t, uncond_embeddings=uncond_embeddings)
I tried many parameters but couldn't edit this dog. Is this a limitation of the current method?
I'm getting this when running the original Stable Diffusion notebook with diffusers==0.3.0
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:4 │
│ in run_and_display:6 │
│ │
│ /usr/lib/python3/dist-packages/torch/autograd/grad_mode.py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ ❱ 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ /home/ubuntu/p2p2/ptp_utils.py:164 in text2image_ldm_stable │
│ │
│ 161 │ │
│ 162 │ # set timesteps │
│ 163 │ extra_set_kwargs = {"offset": 1} │
│ ❱ 164 │ model.scheduler.set_timesteps(num_inference_steps, **extra_set_kwargs) │
│ 165 │ for t in tqdm(model.scheduler.timesteps): │
│ 166 │ │ latents = diffusion_step(model, controller, latents, context, t, guidance_scale, │
│ 167 │
│ │
│ /home/ubuntu/.local/lib/python3.8/site-packages/diffusers/schedulers/scheduling_pndm.py:171 in │
│ set_timesteps │
│ │
│ 168 │ │ │
│ 169 │ │ self.ets = [] │
│ 170 │ │ self.counter = 0 │
│ ❱ 171 │ │ self.set_format(tensor_format=self.tensor_format) │
│ 172 │ │
│ 173 │ def step( │
│ 174 │ │ self, │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'PNDMScheduler' object has no attribute 'tensor_format'
When updating to diffusers 0.8.0:
AttributeError: 'PNDMScheduler' object has no attribute 'tensor_format'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:4 │
│ in run_and_display:6 │
│ │
│ /usr/lib/python3/dist-packages/torch/autograd/grad_mode.py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ ❱ 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ /home/ubuntu/p2p2/ptp_utils.py:164 in text2image_ldm_stable │
│ │
│ 161 │ │
│ 162 │ # set timesteps │
│ 163 │ extra_set_kwargs = {"offset": 1} │
│ ❱ 164 │ model.scheduler.set_timesteps(num_inference_steps, **extra_set_kwargs) │
│ 165 │ for t in tqdm(model.scheduler.timesteps): │
│ 166 │ │ latents = diffusion_step(model, controller, latents, context, t, guidance_scale, │
│ 167 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: set_timesteps() got an unexpected keyword argument 'offset'
The PNDM scheduler API in diffusers has to receive steps_offset at initialization.
So we either move the "offset = 1" somewhere else, or work out why the first error occurs.
Any ideas?
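One way to tolerate both scheduler APIs is to inspect the signature of set_timesteps before passing offset. The sketch below uses two stand-in scheduler classes (OldStyleScheduler and NewStyleScheduler are hypothetical, not diffusers classes) so that it runs without diffusers installed. set_timesteps_compat is also a hypothetical helper, not something in ptp_utils.

```python
import inspect


def set_timesteps_compat(scheduler, num_inference_steps, offset=1):
    """Call set_timesteps, passing offset only if the method accepts it.

    Returns True when offset was passed. When it returns False, the
    offset has to be configured when the scheduler is constructed.
    """
    params = inspect.signature(scheduler.set_timesteps).parameters
    if "offset" in params:
        scheduler.set_timesteps(num_inference_steps, offset=offset)
        return True
    scheduler.set_timesteps(num_inference_steps)
    return False


class OldStyleScheduler:
    """Stand-in for a scheduler whose set_timesteps takes offset."""

    def set_timesteps(self, num_inference_steps, offset=0):
        self.calls = (num_inference_steps, offset)


class NewStyleScheduler:
    """Stand-in for a scheduler that takes steps_offset at construction."""

    def __init__(self, steps_offset=0):
        self.steps_offset = steps_offset

    def set_timesteps(self, num_inference_steps):
        self.calls = (num_inference_steps,)


print(set_timesteps_compat(OldStyleScheduler(), 50))
print(set_timesteps_compat(NewStyleScheduler(steps_offset=1), 50))
```

This only avoids the TypeError. Whether the resulting timesteps match the original notebook still depends on the scheduler being built with the intended steps_offset.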
Dear @amirhertz ,
Thank you for sharing this great work, I really like it.
Do you plan to release the code for real image editing in Section 4.1?
Thank you for your help.
Best Wishes,
Zongze
Running show_cross_attention(controller, 16, ["up", "down"]) throws a KeyError. What is the problem?
Hi, I tried to make the standing cat sit down, but nothing changed. I would appreciate any help. Thank you very much.
I am trying to run the Jupyter notebook, and the third cell gives me the following error.
scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)
MY_TOKEN = ''
LOW_RESOURCE = False
NUM_DDIM_STEPS = 50
GUIDANCE_SCALE = 7.5
MAX_NUM_WORDS = 77
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN, scheduler=scheduler).to(device)
try:
ldm_stable.disable_xformers_memory_efficient_attention()
except AttributeError:
print("Attribute disable_xformers_memory_efficient_attention() is missing")
tokenizer = ldm_stable.tokenizer
TypeError Traceback (most recent call last)
Cell In[3], line 8
6 MAX_NUM_WORDS = 77
7 device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
----> 8 ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN, scheduler=scheduler).to(device)
9 try:
10 ldm_stable.disable_xformers_memory_efficient_attention()
File ~/anaconda3/envs/p2p/lib/python3.8/site-packages/diffusers/pipeline_utils.py:373, in DiffusionPipeline.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
370 if issubclass(class_obj, class_candidate):
371 load_method_name = importable_classes[class_name][1]
--> 373 load_method = getattr(class_obj, load_method_name)
375 loading_kwargs = {}
376 if issubclass(class_obj, torch.nn.Module):
TypeError: getattr(): attribute name must be string
Any comments?
All the other Jupyter notebooks work well.
I've tried running both Colabs and I get a sequence of errors:
TypeError: forward() got an unexpected keyword argument 'attention_mask'
TypeError: getattr(): attribute name must be string
when trying to load the model (ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN).to(device)).
What am I missing?
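Since several reports in this thread hinge on which package versions are installed, it may help to include them when reporting. A minimal sketch using only the standard library follows; installed_versions is a hypothetical helper name, and the package list is just the set mentioned in these issues.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["diffusers", "transformers", "torch"]).items():
    print(f"{name}: {version or 'not installed'}")
```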
I'm trying to run the Null-text inversion code on colab, but can't seem to install it due to xformers issues.
I think I succeeded installing xformers package using !pip install -U --pre xformers
The versions of the packages are:
Torch version: 1.13.0+cu116
xformers version: 0.0.16rc396
diffusers version: 0.10.0
But I get the following error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
[<ipython-input-15-ab2e3648a6a0>](https://localhost:8080/#) in <module>
8 ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN, scheduler=scheduler).to(device)
9 try:
---> 10 ldm_stable.disable_xformers_memory_efficient_attention()
11 except AttributeError:
12 print("Attribute disable_xformers_memory_efficient_attention() is missing")
7 frames
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in disable_xformers_memory_efficient_attention(self)
829 Disable memory efficient attention as implemented in xformers.
830 """
--> 831 self.set_use_memory_efficient_attention_xformers(False)
832
833 def set_use_memory_efficient_attention_xformers(self, valid: bool) -> None:
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in set_use_memory_efficient_attention_xformers(self, valid)
846 Xformers module = getattr(self, module_name)
847 if isinstance(module, torch.nn.Module):
--> 848 fn_recursive_set_mem_eff(module)
849
850 def enable_attention_slicing(self, slice_size: Optional[Union[str, int]] = "auto"):
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in fn_recursive_set_mem_eff(module)
840
841 for child in module.children():
--> 842 fn_recursive_set_mem_eff(child)
843
844 module_names, _, _ = self.extract_init_dict(dict(self.config))
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in fn_recursive_set_mem_eff(module)
840
841 for child in module.children():
--> 842 fn_recursive_set_mem_eff(child)
843
844 module_names, _, _ = self.extract_init_dict(dict(self.config))
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in fn_recursive_set_mem_eff(module)
840
841 for child in module.children():
--> 842 fn_recursive_set_mem_eff(child)
843
844 module_names, _, _ = self.extract_init_dict(dict(self.config))
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in fn_recursive_set_mem_eff(module)
840
841 for child in module.children():
--> 842 fn_recursive_set_mem_eff(child)
843
844 module_names, _, _ = self.extract_init_dict(dict(self.config))
[/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py](https://localhost:8080/#) in fn_recursive_set_mem_eff(module)
837 def fn_recursive_set_mem_eff(module: torch.nn.Module):
838 if hasattr(module, "set_use_memory_efficient_attention_xformers"):
--> 839 module.set_use_memory_efficient_attention_xformers(valid)
840
841 for child in module.children():
[/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py](https://localhost:8080/#) in set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers)
289 def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers: bool):
290 if not is_xformers_available():
--> 291 raise ModuleNotFoundError(
292 "Refer to https://github.com/facebookresearch/xformers for more information on how to install"
293 " xformers",
ModuleNotFoundError: Refer to https://github.com/facebookresearch/xformers for more information on how to install xformers
What am I doing wrong?
Any help would be much appreciated.
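In the traceback above, the notebook cell only catches AttributeError, while the call raises ModuleNotFoundError. A minimal sketch of a guard that handles both cases is shown below. try_disable_xformers and the stand-in pipeline classes are hypothetical and exist only so the example runs without diffusers.

```python
def try_disable_xformers(pipe):
    """Try to disable xformers attention and report what happened.

    Returns "disabled", "method-missing", or "xformers-not-installed".
    Note that an AttributeError raised inside the method would also be
    reported as "method-missing".
    """
    try:
        pipe.disable_xformers_memory_efficient_attention()
    except AttributeError:
        return "method-missing"
    except ModuleNotFoundError:
        return "xformers-not-installed"
    return "disabled"


class PipeWithoutMethod:
    """Stand-in for a pipeline version that lacks the method."""


class PipeWithoutXformers:
    """Stand-in for a pipeline that raises when xformers is absent."""

    def disable_xformers_memory_efficient_attention(self):
        raise ModuleNotFoundError("xformers is not available")


class PipeOk:
    """Stand-in for a pipeline where the call succeeds."""

    def disable_xformers_memory_efficient_attention(self):
        self.xformers_enabled = False


for pipe in (PipeWithoutMethod(), PipeWithoutXformers(), PipeOk()):
    print(type(pipe).__name__, try_disable_xformers(pipe))
```

If xformers is not installed, there is presumably nothing to disable, so continuing after the "xformers-not-installed" case seems reasonable, though I have not verified that against this diffusers version.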
Hi ! 😃
I have a problem regarding the prompt-to-prompt notebook.
The image of the squirrel changes a little bit between the Cross-Attention Visualization :
g_cpu = torch.Generator().manual_seed(8888)
prompts = ["A painting of a squirrel eating a burger"]
controller = AttentionStore()
image, x_t = run_and_display(prompts, controller, latent=None, run_baseline=False, generator=g_cpu)
show_cross_attention(controller, res=16, from_where=("up", "down"))
and Replacement edit cells :
prompts = ["A painting of a squirrel eating a burger",
"A painting of a lion eating a burger"]
controller = AttentionReplace(prompts, NUM_DIFFUSION_STEPS, cross_replace_steps=.8, self_replace_steps=0.4)
_ = run_and_display(prompts, controller, latent=x_t, run_baseline=True)
sections:
The one on the left was generated by the Cross-Attention Visualization cell, and the one on the right by the Replacement edit cell.
If you look closely at the two black circles on the left, you'll see a difference between the two squirrels, which I guess is not supposed to happen.
I think you can reproduce the same problem if you run the following code:
controller = EmptyControl()
g_cpu = torch.Generator().manual_seed(8888)
prompts = ["A painting of a squirrel eating a burger"]
image_1, x_t = run_and_display(prompts, controller, latent=None, run_baseline=False, generator=g_cpu)
g_cpu = torch.Generator().manual_seed(8888)
prompts = ["A painting of a squirrel eating a burger","A painting of a squirrel eating a burger"]
image_2, x_t = run_and_display(prompts, controller, latent=None, run_baseline=False, generator=g_cpu)
The single squirrel in image_1 would be different from the two squirrels generated in image_2.
After going through the code a little, I suspect it comes from the number of prompts. Because the list contains two sentences, we have batch_size = 2 in the Replacement cell. I think that's why it doesn't generate exactly the same picture as when we use only one sentence, so maybe the total batch size influences the generation of an image from text even if the prompt remains the same.
The same problem arises with the Null-text inversion notebook:
Thank you for your help !! 😊
I used gnochi_mirror.jpeg and the associated prompt "A cat sitting next to a mirror" to try DDIM inversion with SD v1.4 (50 steps), but found that the reconstruction quality is fine. What are the settings (and code, if available) needed to reproduce the failure case shown in the Null-Text paper?
Thanks
Dear authors,
Thank you for sharing this interesting and fun research and for releasing the source code!
I was playing with the source code and found that, for some examples, the attention-swapped image is almost the same as the source image (of the attention side).
This was also the case for the example "a photo of a butterfly on a [sth]" from Fig. 5 in the paper.
This overfitting to the source image seems to occur even when we swap for only a small number of diffusion steps, e.g. with cross_replace_steps>0.1 or self_replace_steps>0.1.
Is it expected that some examples, like the one above, need small values (<0.3) for the replace steps?
Are there any guidelines for selecting cross_replace_steps and self_replace_steps?
Thank you for reading!
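For anyone tuning these values, it can help to see how a fraction translates into a number of diffusion steps. The sketch below is a hypothetical helper (replace_window is not a function in the repository). It assumes a single float means "the leading fraction of the steps" and that the boundaries are truncated to integers, which is my reading of the notebooks and should be checked against the controller code.

```python
def replace_window(num_steps, replace_steps):
    """Convert a fraction (or a (start, end) pair of fractions) to step indices.

    Assumption: a single float f is treated as the interval (0, f), and
    each boundary is num_steps * fraction truncated to an integer.
    """
    if isinstance(replace_steps, (int, float)):
        replace_steps = (0.0, float(replace_steps))
    start, end = replace_steps
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError("fractions must satisfy 0 <= start <= end <= 1")
    return int(num_steps * start), int(num_steps * end)


# With 50 steps, the values used in the Replacement edit cell.
print(replace_window(50, 0.8))
print(replace_window(50, 0.4))
```

Under these assumptions, cross_replace_steps=.8 with 50 steps would inject attention for the first 40 steps, which may be why the edited image stays so close to the source.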
Based on my test of image translation (lion->tiger using AFHQ), PTP is better than NT inversion in many cases. I wonder whether the cause is that NT inversion is too sensitive to parameters such as cross_step/self_step.
Thanks for your research :)
I wonder whether this robust attention-injection editing method can be adapted to custom classes, as in the DreamBooth research.
E.g. Photo of a cat riding a bicycle -> [V] car: provide custom class information, such as images of the same car model.
You might be encountering an error about:
hidden_states = self.attn1(norm_hidden_states, attention_mask=attention_mask) + hidden_states
Renaming the argument attention_mask to mask and, for self.attn2, renaming encoder_hidden_states to context fixes the issue.
This happens because the forward function installed by register_attention_control uses different argument names from the new version of diffusers.
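An alternative to renaming the call sites is to let the replacement forward accept both spellings. The sketch below is a stand-in and not the repository's forward: it does no attention computation and only shows the argument handling. patched_forward is a hypothetical name.

```python
def patched_forward(x, context=None, mask=None,
                    encoder_hidden_states=None, attention_mask=None, **unused):
    """Accept both the old (context, mask) and new keyword names.

    Stand-in for the forward installed by register_attention_control;
    a real version would compute attention instead of returning a dict.
    """
    if context is None:
        context = encoder_hidden_states
    if mask is None:
        mask = attention_mask
    return {"x": x, "context": context, "mask": mask}


# Both calling conventions resolve to the same internal names.
print(patched_forward("hidden", context="text", mask=None))
print(patched_forward("hidden", encoder_hidden_states="text", attention_mask=None))
```

The catch-all **unused silently swallows any further keyword a newer diffusers version might pass, which avoids the TypeError but can also hide arguments that matter, so it is worth logging them while debugging.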
text_config_dict is provided which will be used to initialize CLIPTextConfig. The value text_config["id2label"] will be overriden.
TypeError Traceback (most recent call last)
Input In [17], in <cell line: 7>()
5 MAX_NUM_WORDS = 77
6 device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
----> 7 ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN).to(device)
8 tokenizer = ldm_stable.tokenizer
File ~/miniconda3/lib/python3.8/site-packages/diffusers/pipeline_utils.py:373, in DiffusionPipeline.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
370 if issubclass(class_obj, class_candidate):
371 load_method_name = importable_classes[class_name][1]
--> 373 load_method = getattr(class_obj, load_method_name)
375 loading_kwargs = {}
376 if issubclass(class_obj, torch.nn.Module):
TypeError: getattr(): attribute name must be string
Thanks for this amazing work. I tested this function many times, and it always returns a diagonal matrix with 1s on the diagonal. If that is the intended behavior, why not use the built-in torch function directly? If it needs to be corrected, can you help explain the issue?
Greets,
The links under the Quickstart section for prompt-to-prompt_ldm and prompt-to-prompt_stable are broken. Below are the links for each notebook
prompt-to-prompt_ldm
https://github.com/google/prompt-to-prompt/blob/main/%22./prompt-to-prompt_ldm.ipynb%22
prompt-to-prompt_stable
https://github.com/google/prompt-to-prompt/blob/main/%22./prompt-to-prompt_stable.ipynb%22
Thanks,
meefs
This is really great work, thanks for open-sourcing it.
I am currently trying to change the pipeline to support the img2img task and then edit the resulting image, but my attempts have failed.
If it is convenient, please tell me how to support the img2img task.
Hi, this question is about the linear projections l_Q, l_K, l_V of the attention module in the Prompt-to-Prompt paper. The paper states that the linear projections are learnable. However, in the introduction, it is claimed that "this method does not requires model training". The two statements seem to contradict each other. How do you learn the parameters of l_Q, l_K, l_V?
Thanks for sharing this amazing work.
I have a few questions after reading the paper, which are as follows:
The model scheduler looks like:
model scheduler : PNDMScheduler {
"_class_name": "PNDMScheduler",
"_diffusers_version": "0.8.0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"num_train_timesteps": 1000,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"trained_betas": null
}
But this part of the code fails:
for t in tqdm(model.scheduler.timesteps):
latents = diffusion_step(model, controller, latents, context, t, guidance_scale,
Error: TypeError: 'NoneType' object is not iterable
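One possible cause, which this thread does not confirm, is that scheduler.timesteps is still None because set_timesteps was never called successfully, for example after the offset keyword was rejected. A minimal sketch of a defensive call, assuming a scheduler object that exposes set_timesteps and timesteps; set_timesteps_compat and both dummy schedulers are hypothetical stand-ins for illustration only:

```python
import inspect


def set_timesteps_compat(scheduler, num_inference_steps, offset=1):
    """Call scheduler.set_timesteps, passing `offset` only if the method
    accepts it, then check that timesteps was actually populated."""
    params = inspect.signature(scheduler.set_timesteps).parameters
    extra_kwargs = {"offset": offset} if "offset" in params else {}
    scheduler.set_timesteps(num_inference_steps, **extra_kwargs)
    if scheduler.timesteps is None:
        raise RuntimeError("scheduler.timesteps is still None after set_timesteps")
    return scheduler.timesteps


class DummyOldScheduler:
    """Stand-in for a scheduler whose set_timesteps takes an offset."""
    timesteps = None

    def set_timesteps(self, num_inference_steps, offset=0):
        self.timesteps = [t + offset for t in range(num_inference_steps - 1, -1, -1)]


class DummyNewScheduler:
    """Stand-in for a scheduler whose set_timesteps has no offset argument."""
    timesteps = None

    def set_timesteps(self, num_inference_steps):
        self.timesteps = list(range(num_inference_steps - 1, -1, -1))


old_steps = set_timesteps_compat(DummyOldScheduler(), 3)
new_steps = set_timesteps_compat(DummyNewScheduler(), 3)
print(old_steps, new_steps)
```

With a real scheduler, the same helper would be called in place of the direct set_timesteps call before the loop over model.scheduler.timesteps.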
First of all, thank you for creating and maintaining this amazing project. I have been using it for a while and it has been really helpful.
I noticed that the current version of the project seems to be compatible with older versions of PyTorch, and I would like to inquire if there are any plans to update the code to support PyTorch 2.0. The latest PyTorch version has introduced several new features and improvements, which could potentially benefit this project as well.
If you are already working on this update or plan to do so in the near future, I would be happy to know the estimated timeline. In case you need any assistance with testing or adapting the code, please let me know. I would be more than happy to contribute and help in any way possible.
Thank you once again for your great work and looking forward to the possibility of using this project with PyTorch 2.0.
Best regards
Does the code support SD1.5?
Hi, I'm creating a comparison between prompt to prompt, null-text Inversion and other editing approaches using images present in their respective papers.
Could you please share them(haykpoghos[at]gmail[dot]com), or make them publicly available?
What's the theory behind the "localblend", can you give me some clues? Thanks
Hi, @amirhertz!
Thank you for sharing your cool work!
I have a question about the learning rate of your null-text inversion. In the notebook, the learning rate is set as shown below, whereas in your paper it is set to 0.01.
optimizer = Adam([uncond_embeddings], lr=1e-2 * (1. - i / 100.))
where i is the loop variable of for i in range(NUM_DDIM_STEPS).
If we set NUM_DDIM_STEPS above 101, the learning rate becomes negative.
My question is: can we use lr=1e-2 instead of 1e-2 * (1. - i / 100.)?
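To make the concern concrete, here is a small sketch of a decay that matches the notebook formula for up to 100 steps and stays positive beyond that. The helper null_text_lr and its choice of denominator are assumptions made for illustration, not the authors' schedule.

```python
def null_text_lr(i, num_ddim_steps, base_lr=1e-2):
    """Linear decay of the learning rate over the DDIM step index i.

    Hypothetical variant: the denominator is max(100, num_ddim_steps), so for
    num_ddim_steps <= 100 it equals the notebook's 1e-2 * (1 - i / 100), and
    for longer schedules it never reaches zero or goes negative.
    """
    denominator = float(max(100, num_ddim_steps))
    return base_lr * (1.0 - i / denominator)


def notebook_lr(i, base_lr=1e-2):
    # The formula quoted in the question above.
    return base_lr * (1.0 - i / 100.0)


# With 200 DDIM steps the quoted formula is negative at i = 150,
# while the variant above is still positive.
print(notebook_lr(150), null_text_lr(150, 200))
```

Using a constant lr=1e-2 is the other obvious option; the thread does not say which of these the authors would recommend.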
Excellent job!
I have some questions about Global null-text Inversion. I attempted to implement this algorithm, but I can't get a good reconstruction with the final optimized null-text embedding, even if I increase the number of optimization steps from 7500 to 10000. Could you kindly release the code for this algorithm, or share a code snippet? Thanks!
I solved the problem and I closed it.
Links to the notebooks in the Quick Start section of the Readme file are not working.
Hi,
Thanks for this wonderful work! I have a question about the equation of deterministic DDIM sampling in the Null-text Inversion paper.
Based on my understanding, deterministic DDIM sampling is to set the
If you rewrite this equation into the Null-text Inversion paper version, it should be:
This is different from the version in the Null-text Inversion paper.
It may be my understanding is wrong. I would very much appreciate it if you could point me in the right direction!
Thanks,
Jueqi
Are there any instructions on how to get this code working for half precision? If I'm not mistaken, diffusers==0.3.0 might be problematic for this (I think the VAE couldn't handle it), so I upgraded the diffusers version, which should fix that. I am currently running into other errors that I'm slowly debugging. I am a little worried that the version upgrade might be causing more problems than necessary, so if there are specific instructions on how to get this code working for half precision, that would be great to hear.
Hi, the null text is optimized over the different timesteps. I am wondering whether an alternative solution would be to optimize the UNet itself over the different timesteps, only for this kind of condition (copy the UNet and freeze the copy beforehand; then use the frozen copy for normal text input and the optimized UNet for null text).
How can I run null-text inversion on a Colab T4 GPU without a CUDA out-of-memory error?
If that is not possible, how many gigabytes of memory do I need?
Hi, I found that attention map swapping is performed after the softmax operation. In that case, the sum of the attention weights in a row may no longer be equal to 1. I wonder if the authors have tried to perform attention map swapping before the softmax operation.
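The normalization concern can be illustrated with a toy example: if only some entries of a post-softmax attention row are replaced by entries from another row, the result generally no longer sums to 1. This is a sketch with made-up logits over three tokens, not the repository's injection code, and the renormalization at the end is just one hypothetical remedy.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# One attention row (one pixel attending to three text tokens) for the
# source prompt and for the edited prompt; the logits are made up.
source_row = softmax([2.0, 0.5, 0.1])
target_row = softmax([0.1, 0.5, 2.0])

# Inject the source attention for token 0 only, after the softmax.
mixed_row = [source_row[0]] + target_row[1:]
mixed_sum = sum(mixed_row)

# One hypothetical remedy: renormalize the mixed row.
renormalized = [p / mixed_sum for p in mixed_row]

print(sum(source_row), sum(target_row), mixed_sum, sum(renormalized))
```

When an entire row is replaced rather than a subset of its entries, the row is itself a softmax output and the sum stays at 1.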
Hello, I have tried to run this application, but it always gives me this error:
TypeError Traceback (most recent call last)
in
5 MAX_NUM_WORDS = 77
6 device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
----> 7 ldm_stable = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=MY_TOKEN).to(device)
8 tokenizer = ldm_stable.tokenizer
/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
371 load_method_name = importable_classes[class_name][1]
372
--> 373 load_method = getattr(class_obj, load_method_name)
374
375 loading_kwargs = {}
TypeError: getattr(): attribute name must be string
Could you please help me to solve it. Thank you in advance