deep-floyd / IF
License: Other
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
if_I = IFStageI('IF-I-XL-v1.0', device='cuda:0')
if_II = IFStageII('IF-II-L-v1.0', device='cuda:1')
if_III = StableStageIII('stable-diffusion-x4-upscaler', device='cuda:2')
t5 = T5Embedder(device="cuda:3")
I love the new model but miss custom resolutions and aspect ratios. Is there any way to do this yet?
I'm aware that what does or does not constitute "open source" is somewhat contentious, but in my understanding requiring people to sign up for a third party account, consenting to a license through a third party service and using a third party access token to use the supposedly "open" software is pushing the concept of openness past the breaking point.
Deep Floyd are, of course, perfectly within their rights to impose any restrictions and requirements they like, but to then advertise a release as open source for the community credit seems at least a little disingenuous.
This is not clear:
"If you are using torch>=2.0.0, make sure to delete all enable_xformers_memory_efficient_attention() functions."
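One way to read the quoted note: the xformers call is only meant for torch older than 2.0.0, so the call can be gated on the installed version instead of being deleted by hand. A minimal sketch, assuming that checking the major version is enough; the helper name `needs_xformers` is hypothetical and not part of diffusers or deepfloyd_if:

```python
def needs_xformers(torch_version: str) -> bool:
    """Return True when the torch version string is older than 2.0.0."""
    # Drop local build tags such as "+cu118" before parsing.
    release = torch_version.split("+")[0]
    major = int(release.split(".")[0])
    return major < 2


# Hypothetical usage with an already-loaded pipeline object `stage_1`:
#   import torch
#   if needs_xformers(torch.__version__):
#       stage_1.enable_xformers_memory_efficient_attention()
```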
I'm kind of a newbie to GitHub, Hugging Face, Colab, everything.
When I go to this link:
https://huggingface.co/DeepFloyd/IF-I-IF-v1.0/resolve/main/text_encoder/config.json
it responds with "Repo not found".
│ /usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py:872 in │
│ load_checkpoint_in_model │
│ │
│ 869 │ """ │
│ 870 │ tied_params = find_tied_parameters(model) │
│ 871 │ if offload_folder is None and device_map is not None and "disk" in device_map.values │
│ ❱ 872 │ │ raise ValueError( │
│ 873 │ │ │ "At least one of the model submodule will be offloaded to disk, please pass │
│ 874 │ │ ) │
│ 875 │ elif offload_folder is not None and device_map is not None and "disk" in device_map. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: At least one of the model submodule will be offloaded to disk, please pass along an `offload_folder`.
This happens when running modified demo code on Colab.
I have been playing with both the XL and M models to compare speed and quality.
So I loaded the XL model again during the same session.
I have been calling flush() and using del on the pipes and everything else.
Anyway, the line giving me errors is:
text_encoder = T5EncoderModel.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    subfolder="text_encoder",
    device_map="auto",
    load_in_8bit=True,
    variant="8bit",
)
pipe = IFImg2ImgPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    text_encoder=text_encoder,
    unet=None,
    device_map="auto",
)
prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)
# free some memory
del pipe
del text_encoder
for image in images:
flush()
pipe = IFImg2ImgPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    text_encoder=None,
    variant="fp16",
    torch_dtype=torch.float16,
    device_map="auto",
    offload_folder='/content/offload',  # THIS IS APPARENTLY IGNORED? SHOULD IT BE IGNORED?
)
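The flush() helper called above is not shown in the post. A common definition, assumed here rather than taken from the author's notebook, runs the Python garbage collector and then releases PyTorch's cached CUDA memory. The torch import is guarded so the sketch also runs on a machine without torch:

```python
import gc


def flush() -> None:
    """Free unreferenced Python objects, then release cached CUDA memory."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return  # nothing more to do without torch
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

Objects are only freed once no references to them remain, so `del pipe` and `del text_encoder` need to come before the call to `flush()`.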
Would it be possible to post the UNetModel(**model_params) so devs can already start integrating and optimizing with randomly initialized weights until the actual ones are released?
That would allow testing optimization ideas, which is hard without knowing the exact model size. I couldn't find it in the code, unless I missed it.
Hi, thanks for releasing this awesome model.
Stage 3 currently uses "stable-diffusion-x4-upscaler", which has a large memory requirement.
Can we use "stabilityai/sd-x2-latent-upscaler" instead? It has a smaller memory footprint and is faster as well.
Running the command "pip install deepfloyd_if==1.0.0" on Windows 10 gives:
ERROR: Could not find a version that satisfies the requirement torch<2.0.0 (from deepfloyd-if) (from versions: 2.0.0)
ERROR: No matching distribution found for torch<2.0.0
If I install Pillow>=9.2.0, I get: module PIL has no attribute "Resampling".
If I then downgrade to Pillow==9.0.0 to avoid that error, I get: deepfloyd-if 1.0.1 requires Pillow>=9.2.0.
Hello, and thank you for the amazing work you've done on this SOTA text-to-image model. After testing the HF demo, I noticed that the 256 -> 1024 super-resolution step struggles to give good results. Would it be possible to introduce an intermediate step, i.e. 256 -> 512 -> 1024?
First, thanks for answering my questions.
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_xformers_memory_efficient_attention() # remove line if torch.version >= 2.0.0
stage_1.enable_model_cpu_offload()
stage_2 = DiffusionPipeline.from_pretrained(
"DeepFloyd/IF-II-L-v1.0", text_encoder=None, variant="fp16", torch_dtype=torch.float16
)
stage_2.enable_xformers_memory_efficient_attention() # remove line if torch.version >= 2.0.0
stage_2.enable_model_cpu_offload()
safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules, torch_dtype=torch.float16)
stage_3.enable_xformers_memory_efficient_attention() # remove line if torch.version >= 2.0.0
stage_3.enable_model_cpu_offload()
prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
generator = torch.manual_seed(0)
image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_I.png")
image = stage_2(
image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt"
).images
pt_to_pil(image)[0].save("./if_stage_II.png")
image = stage_3(prompt=prompt, image=image, generator=generator, noise_level=100).images
image[0].save("./if_stage_III.png")
but when I followed this code:
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device)
if_II = IFStageII('IF-II-L-v1.0', device=device)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
t5 = T5Embedder(device="cpu")
from deepfloyd_if.pipelines import dream
prompt = 'ultra close-up color photo portrait of rainbow owl with deer horns in the woods'
count = 4
result = dream(
t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
prompt=[prompt]*count,
seed=42,
if_I_kwargs={
"guidance_scale": 7.0,
"sample_timestep_respacing": "smart100",
},
if_II_kwargs={
"guidance_scale": 4.0,
"sample_timestep_respacing": "smart50",
},
if_III_kwargs={
"guidance_scale": 9.0,
"noise_level": 20,
"sample_timestep_respacing": "75",
},
)
if_III.show(result['III'], size=14)
The example notebook is missing
!pip install protobuf==3.20.1
Just add that after the other pip installs and before T5 is loaded, and it works.
Also, if you're using a Docker image, make sure to use:
nvidia/cuda:11.7.1-cudnn8-devel-ubuntu22.04
ValueError:
Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit
the quantized model. If you want to dispatch the model on the CPU or the disk while keeping
these modules in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom
`device_map` to `from_pretrained`. Check
https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
for more details.
While running through one of the examples, I hit the following error related to the transformers version:
Traceback (most recent call last):
File "test3.py", line 9, in <module>
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-M-v1.0", variant="fp16", torch_dtype=torch.float16)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1039, in from_pretrained
loaded_sub_model = load_sub_model(
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 431, in load_sub_model
raise ImportError(
ImportError: When passing `variant='fp16'`, please make sure to upgrade your `transformers` version to at least 4.27.0.dev0
It appears that 4.25.1 is the version installed when using the requirements.txt file and following the README instructions.
I'm rerunning now (after removing 4.25.1 and installing transformers 4.28.1). Would 4.28.1 be compatible, or do we need to keep the library below a certain version?
Thanks! : )
Sharing the sample code I've been utilizing to test:
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
from huggingface_hub import login
login()
# stage 1
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-M-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_1.enable_model_cpu_offload()
# stage 2
stage_2 = DiffusionPipeline.from_pretrained(
"DeepFloyd/IF-II-M-v1.0", text_encoder=None, variant="fp16", torch_dtype=torch.float16
)
stage_2.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_2.enable_model_cpu_offload()
# stage 3
safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules, torch_dtype=torch.float16)
stage_3.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_3.enable_model_cpu_offload()
I am getting the impression that the images with well-rendered text have been cherry-picked. I have run a few experiments, and either the text is nowhere to be seen in the images or it is badly misspelled.
Thanks for any insights.
I have tried all kinds of combinations: torch 1.13.1 and 2.0.0, CUDA 11.3 and CUDA 11.8.
torch.matmul fails on a GPU.
It would be very nice to have a centralized place (the GitHub Discussions tab for this repo) to discuss getting the code up and running, instead of having discussions scattered across random subreddits and Discord servers.
When I tried to open the notebook (via Jupyter Notebook), I got the following error message:
Error loading notebook
Unreadable Notebook: /home/ogem/codes/public/2023/IF/notebooks/pipes-DeepFloyd-IF.ipynb NotJSONError("Notebook does not appear to be JSON: 'version https://git-lfs.github.com/spec...")
Is there a JSON syntax error, or is there another way to open and use the notebook?
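The "version https://git-lfs.github.com/spec..." text in the error suggests that the .ipynb file on disk is a Git LFS pointer file rather than the notebook itself, which would explain why it is not valid JSON. A minimal sketch for checking this, assuming the standard pointer layout whose first line starts with that prefix; the function name is hypothetical:

```python
from pathlib import Path

LFS_PREFIX = b"version https://git-lfs.github.com/spec/"


def is_lfs_pointer(path: str) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as handle:
        return handle.read(len(LFS_PREFIX)) == LFS_PREFIX
```

If this returns True for the notebook, running `git lfs pull` in the cloned repository should replace the pointer with the actual notebook content.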
I downloaded the model from Hugging Face: https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/discussions?status=all
but it doesn't work when I run it in the Stable Diffusion web UI with the sample prompt 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'
Congrats! Super great work!
I've noticed that you're currently using the original DDPM scheduler, which is rather slow. Sampling would be much faster if DPM-Solver++ were integrated into this work.
Note that the original DPM-Solver++ may have numerical issues when using the cosine beta schedule, and I've added a fix here: https://github.com/LuChengTHU/dpm-solver/blob/5c6ee9f1e6b60c8c54f955fbaab0a6717fc2b75b/dpm_solver_pytorch.py#L105
I'm happy to help to integrate DPM-Solver++ into IF when the model is released :)
AssertionError Traceback (most recent call last)
Cell In[24], line 4
1 count = 4
2 prompt = 'a boy'
----> 4 result = style_transfer(
5 t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
6 support_pil_img=zkc,
7 prompt=[prompt]*count,
8 style_prompt=[
9 f'in style lego',
10 f'in style zombie',
11 f'in style origami',
12 f'in style anime',
13 ],
14 seed=42,
15 if_I_kwargs={
16 "guidance_scale": 10.0,
17 "sample_timestep_respacing": "10,10,10,10,10,0,0,0,0,0",
18 'support_noise_less_qsample_steps': 5,
19 'positive_mixer': 0.8,
20 },
21 if_II_kwargs={
22 "guidance_scale": 4.0,
23 "sample_timestep_respacing": 'smart50',
24 "support_noise_less_qsample_steps": 5,
25 'positive_mixer': 1.0,
26 },
27 )
28 if_I.show(result['III'], 2, 14)
File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/pipelines/style_transfer.py:91, in style_transfer(t5, if_I, if_II, if_III, support_pil_img, style_prompt, prompt, negative_prompt, seed, if_I_kwargs, if_II_kwargs, if_III_kwargs, progress, return_tensors, disable_watermark)
87 if_II_kwargs['progress'] = progress
89 if_II_kwargs['support_noise'] = mid_res
---> 91 stageII_generations, _meta = if_II.embeddings_to_image(**if_II_kwargs)
92 pil_images_II = if_II.to_images(stageII_generations, disable_watermark=disable_watermark)
94 result['II'] = pil_images_II
File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/modules/stage_II.py:26, in IFStageII.embeddings_to_image(self, low_res, t5_embs, style_t5_embs, positive_t5_embs, negative_t5_embs, batch_repeat, aug_level, dynamic_thresholding_p, dynamic_thresholding_c, sample_loop, sample_timestep_respacing, guidance_scale, img_scale, positive_mixer, progress, seed, sample_fn, **kwargs)
21 def embeddings_to_image(
22 self, low_res, t5_embs, style_t5_embs=None, positive_t5_embs=None, negative_t5_embs=None, batch_repeat=1,
23 aug_level=0.25, dynamic_thresholding_p=0.95, dynamic_thresholding_c=1.0, sample_loop='ddpm',
24 sample_timestep_respacing='smart50', guidance_scale=4.0, img_scale=4.0, positive_mixer=0.5,
25 progress=True, seed=None, sample_fn=None, **kwargs):
---> 26 return super().embeddings_to_image(
27 t5_embs=t5_embs,
28 low_res=low_res,
29 style_t5_embs=style_t5_embs,
30 positive_t5_embs=positive_t5_embs,
31 negative_t5_embs=negative_t5_embs,
32 batch_repeat=batch_repeat,
33 aug_level=aug_level,
34 dynamic_thresholding_p=dynamic_thresholding_p,
35 dynamic_thresholding_c=dynamic_thresholding_c,
36 sample_loop=sample_loop,
37 sample_timestep_respacing=sample_timestep_respacing,
38 guidance_scale=guidance_scale,
39 positive_mixer=positive_mixer,
40 img_size=256,
41 img_scale=img_scale,
42 progress=progress,
43 seed=seed,
44 sample_fn=sample_fn,
45 **kwargs
46 )
File ~/miniconda3/envs/if/lib/python3.10/site-packages/deepfloyd_if/modules/base.py:181, in IFBaseModule.embeddings_to_image(self, t5_embs, low_res, style_t5_embs, positive_t5_embs, negative_t5_embs, batch_repeat, dynamic_thresholding_p, sample_loop, sample_timestep_respacing, dynamic_thresholding_c, guidance_scale, aug_level, positive_mixer, blur_sigma, img_size, img_scale, aspect_ratio, progress, seed, sample_fn, support_noise, support_noise_less_qsample_steps, inpainting_mask, **kwargs)
179 else:
180 assert support_noise_less_qsample_steps < len(diffusion.timestep_map) - 1
--> 181 assert support_noise.shape == (1, 3, image_h, image_w)
182 q_sample_steps = torch.tensor([int(len(diffusion.timestep_map) - 1 - support_noise_less_qsample_steps)])
183 support_noise = support_noise.cpu()
However I am using a station with 4 x A100(40G)
if_I = IFStageI('/IF/deepfloyd-if/IF-I-XL-v1.0', device='cuda:0')
if_II = IFStageII('/IF/deepfloyd-if/IF-II-L-v1.0', device='cuda:1')
if_III = StableStageIII('/IF/deepfloyd-if/stable-diffusion-x4-upscaler', device='cuda:2')
t5 = T5Embedder(device="cuda:3")
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 39.39 GiB total capacity; 29.37 GiB already allocated; 6.90 GiB free; 30.95 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
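The error message itself points at max_split_size_mb and PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of setting it from Python, assuming this happens before torch initializes CUDA; the value 128 is an arbitrary illustration, not a recommendation from the IF authors. This setting only addresses fragmentation: the log shows an 8.00 GiB request against 6.90 GiB free, so it may not be enough on its own.

```python
import os

# Must be set before torch initializes its CUDA allocator.
# 128 is an assumed example value, not a tuned setting.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only after the variable is set
```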
Hi!
I've tried to launch the inpainting example from the internal notebook and got an error.
----> 1 result = inpainting(
2 t5=t5, if_I=if_I,
3 if_II=if_II,
4 if_III=if_III,
5 support_pil_img=raw_pil_image.resize((128, 128), resample=Image.BICUBIC),
6 inpainting_mask=inpainting_mask,
7 prompt=[
8 'blue sunglasses',
9 ],
10 seed=42,
11 if_I_kwargs={
12 "guidance_scale": 7.0,
13 "sample_timestep_respacing": "10,10,10,10,10,0,0,0,0,0",
14 'support_noise_less_qsample_steps': 0,
15 },
16 if_II_kwargs={
17 "guidance_scale": 4.0,
18 'aug_level': 0.0,
19 "sample_timestep_respacing": '100',
20 },
21 )
22 if_I.show(result['I'], 2, 3)
23 if_I.show(result['II'], 2, 6)
File ~/miniconda3/envs/df/lib/python3.8/site-packages/deepfloyd_if/pipelines/inpainting.py:61, in inpainting(t5, if_I, if_II, if_III, support_pil_img, prompt, inpainting_mask, negative_prompt, seed, if_I_kwargs, if_II_kwargs, if_III_kwargs, progress, return_tensors, disable_watermark)
57 if_I_kwargs['negative_t5_embs'] = negative_t5_embs
59 if_I_kwargs['support_noise'] = low_res
---> 61 inpainting_mask_I = img_as_bool(resize(inpainting_mask[0].cpu(), (3, image_h, image_w)))
62 inpainting_mask_I = torch.from_numpy(inpainting_mask_I).unsqueeze(0).to(if_I.device)
64 if_I_kwargs['inpainting_mask'] = inpainting_mask_I
File ~/miniconda3/envs/df/lib/python3.8/site-packages/skimage/transform/_warps.py:154, in resize(image, output_shape, order, mode, cval, clip, preserve_range, anti_aliasing, anti_aliasing_sigma)
149 image = image.astype(np.float32)
151 if anti_aliasing is None:
152 anti_aliasing = (
153 not input_type == bool and
--> 154 not (np.issubdtype(input_type, np.integer) and order == 0) and
155 any(x < y for x, y in zip(output_shape, input_shape)))
157 if input_type == bool and anti_aliasing:
158 raise ValueError("anti_aliasing must be False for boolean images")
File ~/miniconda3/envs/df/lib/python3.8/site-packages/numpy/core/numerictypes.py:416, in issubdtype(arg1, arg2)
358 r"""
359 Returns True if first argument is a typecode lower/equal in type hierarchy.
360
(...)
413
414 """
415 if not issubclass_(arg1, generic):
--> 416 arg1 = dtype(arg1).type
417 if not issubclass_(arg2, generic):
418 arg2 = dtype(arg2).type
TypeError: Cannot interpret 'torch.float32' as a data type
I assume something is wrong with scikit-image, but I'm not sure what.
Please assist.
Thanks!
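The traceback shows skimage's resize receiving a torch tensor (inpainting_mask[0].cpu()), whose dtype NumPy cannot interpret. One possible workaround, offered as an assumption rather than as the project's official fix, is to convert the mask to a NumPy array before it reaches resize. The helper below is hypothetical and duck-typed so that it runs without torch installed:

```python
def to_numpy_if_tensor(value):
    """Convert torch-like tensors to NumPy arrays; pass other values through."""
    # torch.Tensor exposes detach(), cpu() and numpy(); anything lacking
    # detach() is assumed to be array-like already.
    if hasattr(value, "detach"):
        return value.detach().cpu().numpy()
    return value
```

In the failing line, this would mean passing to_numpy_if_tensor(inpainting_mask[0]) to resize instead of inpainting_mask[0].cpu().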
The Jupyter notebook links in the README are broken.
They point here: https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/blob/main/notebooks/pipes-DeepFloyd-IF-v1.0.ipynb
How can we fine-tune it on a single subject with some 10-15 photos and instance/class prompts?
Where is the API for saving images in IF?
error info:
from deepfloyd_if.modules.t5 import T5Embedder
device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device)
D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\huggingface_hub\file_download.py:1104: FutureWarning: The `force_filename` parameter is deprecated as a new caching system, which keeps the filenames as they are on the Hub, is now in place.
warnings.warn(
if_II = IFStageII('IF-II-L-v1.0', device=device)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
Traceback (most recent call last):
File "", line 1, in
File "D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\deepfloyd_if\modules\stage_III_sd_x4.py", line 34, in __init__
self.model = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch_dtype, token=self.hf_token)
File "D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 884, in from_pretrained
cached_folder = cls.download(
File "D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1208, in download
config_file = hf_hub_download(
File "D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\huggingface_hub\utils\_validators.py", line 112, in _inner_fn
validate_repo_id(arg_value)
File "D:\AiTools\DeepFloydIF\IF\vnev\lib\site-packages\huggingface_hub\utils\_validators.py", line 166, in validate_repo_id
raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'stabilityai\stable-diffusion-x4-upscaler'.
No model is found on Hugging Face under the given user.
I got an error when running the sample code from the HF docs: https://huggingface.co/docs/diffusers/api/pipelines/if#diffusers.IFImg2ImgSuperResolutionPipeline
I am already logged in to Hugging Face. It seems there is a bug in how the model name is mapped? I never used IF-I-XL; I used IF-I-IF instead.
I am running the dream pipeline on 4 water-cooled Maxwell Titan X, with each stage on its own GPU. It is slow as molasses. It is painful to watch.
There are no OOMs; the stages do fit into the roughly 12 GiB of VRAM that each Titan has.
Any suggestions are welcome.
Manjaro Linux, 4090, AMD CPU.
I created a deepfloyd env with python=3.10 and activated it, then ran:
pip install -U huggingface_hub diffusers transformers safetensors sentencepiece accelerate bitsandbytes torch
I started Python and got the token from Hugging Face.
I created the script file and ran it, and got these errors.
Can someone point me in the right direction?
2023-04-29 17:11:30.330731: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-29 17:11:30.466991: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Traceback (most recent call last):
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1146, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 22, in <module>
from ...image_transforms import (
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/image_transforms.py", line 48, in <module>
import tensorflow as tf
File "/home/vhey/.local/lib/python3.10/site-packages/tensorflow/__init__.py", line 37, in <module>
from tensorflow.python.tools import module_util as _module_util
File "/home/vhey/.local/lib/python3.10/site-packages/tensorflow/python/__init__.py", line 37, in <module>
from tensorflow.python.eager import context
File "/home/vhey/.local/lib/python3.10/site-packages/tensorflow/python/eager/context.py", line 27, in <module>
import six
ModuleNotFoundError: No module named 'six'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/vhey/deepfloyd/txt2img.py", line 1, in <module>
from diffusers import DiffusionPipeline
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/diffusers/__init__.py", line 58, in <module>
from .pipelines import (
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/diffusers/pipelines/__init__.py", line 45, in <module>
from .alt_diffusion import AltDiffusionImg2ImgPipeline, AltDiffusionPipeline
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/diffusers/pipelines/alt_diffusion/__init__.py", line 32, in <module>
from .pipeline_alt_diffusion import AltDiffusionPipeline
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/diffusers/pipelines/alt_diffusion/pipeline_alt_diffusion.py", line 20, in <module>
from transformers import CLIPImageProcessor, XLMRobertaTokenizer
File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1137, in __getattr__
value = getattr(module, name)
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1136, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/vhey/miniconda3/envs/deepfloyd/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1148, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.image_processing_clip because of the following error (look up to see its traceback):
No module named 'six'
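The first traceback shows transformers importing TensorFlow from /home/vhey/.local (the user site-packages, outside the conda env), and that TensorFlow install then fails on the missing six module. A minimal sketch for checking which modules resolve and from where, using only the standard library; the function name is hypothetical:

```python
import importlib.util


def module_origins(names):
    """Map each top-level module name to the file it would load from.

    The value is None when the module cannot be found (namespace
    packages also report None as their origin).
    """
    origins = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        origins[name] = spec.origin if spec is not None else None
    return origins


print(module_origins(["six", "tensorflow"]))
```

Installing six into the env, or starting Python with PYTHONNOUSERSITE=1 so that packages under ~/.local are ignored, are two things that might be worth trying.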
I wonder whether this section of the license is supposed to be included. It appears to say that removing the content filters is not allowed under any circumstances. If that is the case, it will trigger conflict with the community immediately after the weights are released.
2. All persons obtaining a copy or substantial portion of the Software,
a modified version of the Software (or substantial portion thereof), or
a derivative work based upon this Software (or substantial portion thereof)
must not delete, remove, disable, diminish, or circumvent any inference filters or
inference filter mechanisms in the Software, or any portion of the Software that
implements any such filters or filter mechanisms.
https://github.com/deep-floyd/IF/blob/af64403da0ae2667e5d40670f4014de04bd5c523/LICENSE
Is there a list of commands somewhere?
On Windows, when running the notebook for the IF-I-XL-v1.0 model, the following error occurs when trying to download the stable-diffusion-x4-upscaler:
HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'stabilityai\stable-diffusion-x4-upscaler'.
A quick fix would be to change line 23 in the file [your-venv-name]\Lib\site-packages\deepfloyd_if\modules
to model_id = 'stabilityai/' + self.dir_or_name
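The backslash in 'stabilityai\stable-diffusion-x4-upscaler' is what a path join produces on Windows, while Hub repo ids always use a forward slash. The sketch below reproduces both behaviours on any OS with the standard-library ntpath and posixpath modules. That the library builds the id with os.path.join is an assumption inferred from the error, not something shown in the report:

```python
import ntpath
import posixpath

org, name = "stabilityai", "stable-diffusion-x4-upscaler"

# ntpath is what os.path resolves to on Windows.
windows_style = ntpath.join(org, name)
# posixpath (or plain string concatenation with "/") keeps the repo id valid.
hub_style = posixpath.join(org, name)

print(windows_style)  # stabilityai\stable-diffusion-x4-upscaler
print(hub_style)      # stabilityai/stable-diffusion-x4-upscaler
```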
The README lists a minimum of 16 GB of VRAM without the stable x4 upscaler and 24 GB with it. However, you can run it with the x4 upscaler on as little as 6 GB of VRAM by using sequential offload on the first stage and text encoder (in fp16) and CPU offload on the second and third stages. You can also run all three stages using CPU offload on 16 GB (maybe less). You do need sufficient system RAM, though.
import torch
from diffusers import DiffusionPipeline, IFPipeline, IFSuperResolutionPipeline

stage_1 = IFPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    variant="fp16",
    torch_dtype=torch.float16,
)
stage_2 = IFSuperResolutionPipeline.from_pretrained(
    "DeepFloyd/IF-II-L-v1.0",
    text_encoder=None,
    variant="fp16",
    torch_dtype=torch.float16,
)
stage_3 = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
)
# Option A: 16 GB of VRAM
stage_1.enable_model_cpu_offload()
stage_2.enable_model_cpu_offload()
stage_3.enable_model_cpu_offload()
# Option B: 6 GB of VRAM (use instead of option A, not in addition to it)
stage_1.enable_sequential_cpu_offload()
stage_2.enable_model_cpu_offload()
stage_3.enable_model_cpu_offload()
I tested this on PyTorch 2.0.0+cu118, using torch.cuda.set_per_process_memory_fraction() to limit the amount of VRAM torch can use.
Sequential offload significantly slows down the first stage, but that's better than not being able to run it at all.
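To reproduce this kind of test, the cap passed to torch.cuda.set_per_process_memory_fraction() is a fraction of the card's total memory. A minimal sketch of the arithmetic, assuming a hypothetical 24 GB card capped at 6 GB; the helper name is made up for illustration, and the torch call is left as a comment so the block runs without a GPU:

```python
def memory_fraction(cap_gb: float, total_gb: float) -> float:
    """Fraction of total VRAM that corresponds to the requested cap."""
    if not 0 < cap_gb <= total_gb:
        raise ValueError("cap must be positive and no larger than total VRAM")
    return cap_gb / total_gb


fraction = memory_fraction(6, 24)  # assumed example sizes

# With torch installed and a CUDA device present:
#   import torch
#   torch.cuda.set_per_process_memory_fraction(fraction, device=0)
```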
Hey, I'm trying to load the model onto a GPU with 24 GB of VRAM.
This is my code:
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", torch_dtype=torch.float16)
stage_1.enable_xformers_memory_efficient_attention()
stage_1.enable_model_cpu_offload()
The kernel crashes while loading the model into memory. I also tried loading it through deepfloyd_if, and it crashes in the same way while running the following code:
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device)
if_II = IFStageII('IF-II-L-v1.0', device=device)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
t5 = T5Embedder(device="cpu")
This is the error shown in the notebook:
Canceled future for execute_request message before replies were done The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.
I tracked memory usage and it never passes the 14 GB mark. How do I resolve this?
Will the fine-tuning code be released as well? Awesome project, by the way! Can't wait to train a custom model.
Then why is this file behind a login?
https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/resolve/main/config.yml
I just want to run it on my server, use my GPUs, storage, etc...
I tried to follow the instructions, which ended in total disaster. Each pip package wants to install its own torch version, and I couldn't get anything to work. I followed the instructions 1:1 multiple times in a few different fresh envs, to no avail.
I also tried with a fresh PyTorch 2 venv, also to no avail.
Could you please re-test your instructions, preferably on Windows? I have an RTX 4090 with 24 GB of VRAM, and I couldn't even get to the point of loading the model into VRAM.
Can we use FLAN-T5 as a language model?
Those FLAN models can represent English and other languages significantly better in our tests.
"If you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks covering also more languages."
After going through the README instructions, I tried the following test script just to get started, but I consistently receive this error:
NotImplementedError: Memory efficient attention with `xformers` is currently not supported when `self.added_kv_proj_dim` is defined.
(The full traceback is shared after the test code section.)
Test code:
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
# stage 1
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_1.enable_model_cpu_offload()
# stage 2
stage_2 = DiffusionPipeline.from_pretrained(
"DeepFloyd/IF-II-L-v1.0", text_encoder=None, variant="fp16", torch_dtype=torch.float16
)
stage_2.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_2.enable_model_cpu_offload()
# stage 3
safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules, torch_dtype=torch.float16)
stage_3.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
stage_3.enable_model_cpu_offload()
prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'
# text embeds
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
generator = torch.manual_seed(0)
# stage 1
image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_I.png")
# stage 2
image = stage_2(
image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt"
).images
pt_to_pil(image)[0].save("./if_stage_II.png")
# stage 3
image = stage_3(prompt=prompt, image=image, generator=generator, noise_level=100).images
image[0].save("./if_stage_III.png")
Error traceback:
Traceback (most recent call last):
File "test2.py", line 8, in <module>
stage_1.enable_xformers_memory_efficient_attention() # remove line if torch.__version__ >= 2.0.0
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1448, in enable_xformers_memory_efficient_attention
self.set_use_memory_efficient_attention_xformers(True, attention_op)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1474, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1464, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid, attention_op)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 227, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 223, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 223, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 223, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 220, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid, attention_op)
File "${HOME}/miniconda3/envs/deepfloyd/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 161, in set_use_memory_efficient_attention_xformers
raise NotImplementedError(
NotImplementedError: Memory efficient attention with `xformers` is currently not supported when `self.added_kv_proj_dim` is defined.
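A possible workaround, sketched under assumptions: the comment in the test code already says to drop the `enable_xformers_memory_efficient_attention()` call on torch 2.0.0 or later, and the traceback shows diffusers raising `NotImplementedError` for this model's attention layers. The helper below (`maybe_enable_xformers` is a hypothetical name, not part of diffusers) skips the call on torch >= 2.0 and otherwise treats the error as non-fatal.

```python
def torch_major_version(version: str) -> int:
    """Return the major version from a string such as '2.0.1+cu117'."""
    return int(version.split("+")[0].split(".")[0])


def maybe_enable_xformers(pipe, torch_version: str) -> bool:
    """Try to enable xformers attention on `pipe`; return True if enabled.

    `pipe` is assumed to expose enable_xformers_memory_efficient_attention(),
    as the diffusers pipelines in the test code do.
    """
    if torch_major_version(torch_version) >= 2:
        # README advice: do not call the xformers function on torch >= 2.0.0.
        return False
    try:
        pipe.enable_xformers_memory_efficient_attention()
    except NotImplementedError as err:
        # Raised by diffusers when added_kv_proj_dim is defined (see traceback).
        print(f"xformers attention not enabled: {err}")
        return False
    return True
```

In the test code, each direct call would become something like `maybe_enable_xformers(stage_1, torch.__version__)`; the pipeline then falls back to its default attention implementation.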
How can I ensure that the output image size in image-to-image matches the input? Going by the example Colab code, I use this:
import torch
from PIL import Image
from transformers import T5EncoderModel
from diffusers import IFImg2ImgPipeline, IFImg2ImgSuperResolutionPipeline
from diffusers.utils import pt_to_pil

original_image = Image.open("input.png")

# Load the T5 text encoder in 8-bit to save memory.
text_encoder = T5EncoderModel.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    subfolder="text_encoder",
    device_map="auto",
    load_in_8bit=True,
    variant="8bit"
)

# Pipeline without the UNet, used only to compute the prompt embeddings.
pipe = IFImg2ImgPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    text_encoder=text_encoder,
    unet=None,
    device_map="auto"
)
prompt = "anime style"
prompt_embeds, negative_embeds = pipe.encode_prompt(prompt)

# Stage 1: image-to-image, without the text encoder.
pipe = IFImg2ImgPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0",
    text_encoder=None,
    variant="fp16",
    torch_dtype=torch.float16,
    device_map="auto"
)
generator = torch.Generator().manual_seed(0)
image = pipe(
    image=original_image,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    output_type="pt",
    generator=generator,
).images
pil_image = pt_to_pil(image)
pil_image[0].save("output.png")

# Stage 2: super resolution on the stage 1 output.
pipe = IFImg2ImgSuperResolutionPipeline.from_pretrained(
    "DeepFloyd/IF-II-L-v1.0",
    text_encoder=None,
    variant="fp16",
    torch_dtype=torch.float16,
    device_map="auto"
)
image = pipe(
    image=image,
    original_image=original_image,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    generator=generator,
).images
image[0].save("output.png")
Which works, but the output size is always smaller than the input image.
What am I missing?
This is the output for a 550x550 input image.
If possible, please give full code examples too. You have a good initial code snippet on the readme for Text to Image, but then the rest of the examples are incomplete. The same sort of full code examples would be very helpful.
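A note on the size question, offered as a sketch rather than a confirmed answer: per the project README, stage I produces 64x64 images and stage II upscales them to 256x256, so a two-stage script like the one above would return 256x256 whatever the input size was. One option, assuming a plain resample is acceptable, is to resize the final image back to the input's dimensions. `resize_to_match` is a hypothetical helper; it relies only on the `.size` attribute and `.resize()` method that PIL images provide.

```python
def resize_to_match(result, reference, resample=None):
    """Resize `result` to the (width, height) of `reference`.

    Both arguments are assumed to behave like PIL.Image.Image objects.
    `resample` is an optional PIL resampling filter.
    """
    target_size = reference.size  # (width, height)
    if result.size == target_size:
        return result
    if resample is None:
        # Let the image class use its own default filter.
        return result.resize(target_size)
    return result.resize(target_size, resample)
```

For the last step of the script this would read `resize_to_match(image[0], original_image).save("output.png")`. Resampling from 256x256 to 550x550 adds no detail, so running the x4 upscaler stage first and then resizing down may give a sharper result.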
Hello. It looks amazingly promising.
I plan to make a tutorial on my channel (https://www.youtube.com/SECourses), but it looks super technical, so people may not like it.
So my questions are:
1st: How hard would it be to implement this in automatic1111? I don't want my tutorial to become obsolete in a few days.
2nd: Do you have a Gradio script that would make it easier to use?
3rd: Why is the watermark forced? It doesn't make sense.
import os
import torch
os.environ['FORCE_MEM_EFFICIENT_ATTN'] = "1"
import sys
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
from deepfloyd_if.pipelines import dream, style_transfer, super_resolution, inpainting
import torch.nn.functional as F
import random
import torchvision.transforms as T
import numpy as np
import requests
from PIL import Image
import re
print("Loaded modules")
if_I = IFStageI('IF-I-XL-v1.0', device='cuda:0')
if_II = IFStageII('IF-II-L-v1.0', device='cuda:1')
if_III = StableStageIII('stable-diffusion-x4-upscaler', device='cuda:2')
t5 = T5Embedder(device='cuda:3')
prompt = 'lush garden'
count = 4
result = dream(
    t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
    prompt=[prompt]*count,
    seed=42,
    if_I_kwargs={
        "guidance_scale": 7.0,
        "sample_timestep_respacing": "smart100",
    },
    if_II_kwargs={
        "guidance_scale": 4.0,
        "sample_timestep_respacing": "smart50",
    },
)
if_I.show(result['I'], size=3)
if_I.show(result['II'], size=6)
if_I.show(result['III'], size=14)
166 return module._hf_hook.post_forward(module, output)
File ~/.local/lib/python3.8/site-packages/transformers/models/t5/modeling_t5.py:530, in T5Attention.forward(self, hidden_states, mask, key_value_states, position_bias, past_key_value, layer_head_mask, query_length, use_cache, output_attentions)
525 value_states = project(
526 hidden_states, self.v, key_value_states, past_key_value[1] if past_key_value is not None else None
527 )
529 # compute scores
--> 530 scores = torch.matmul(
531 query_states, key_states.transpose(3, 2)
532 ) # equivalent of torch.einsum("bnqd,bnkd->bnqk", query_states, key_states), compatible with onnx op>9
534 if position_bias is None:
535 if not self.has_relative_attention_bias:
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmStridedBatchedExFix(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
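The `CUDA_R_16BF` arguments in the error suggest that the T5 matmul is running in bfloat16, which CUDA GPUs support natively only from compute capability 8.0 (Ampere) onward. Assuming that is the cause here, which the report does not confirm, one approach is to choose the text encoder's dtype from the device capability. `pick_t5_dtype_name` is a hypothetical helper, and the float32 fallback is an assumption (float16 would halve the memory but can overflow in T5).

```python
def pick_t5_dtype_name(compute_capability, fallback="float32"):
    """Return 'bfloat16' on GPUs with native support, else `fallback`.

    `compute_capability` is a (major, minor) tuple such as the one
    returned by torch.cuda.get_device_capability(device).
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else fallback
```

With torch available, `getattr(torch, pick_t5_dtype_name(torch.cuda.get_device_capability(3)))` gives a dtype for `cuda:3`. Whether `T5Embedder` accepts it through a `torch_dtype` argument should be checked against the installed version of `deepfloyd_if`.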