tothebeginning / pulid
Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
License: Apache License 2.0
I try to load LoRAs with
pipeline.pipe.load_lora_weights("/kaggle/input/lorass/acuarelac1400.safetensors")
I don't know if this is the correct way; it would be helpful if you told me how to load LoRAs.
But I get:
---------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[175], line 23
21 seed_everything(seed)
22 #out=pipeline.inference(prompt, init_image, mask_image , 0.8, (1, H, W), neg_prompt, id_embeddings, id_scale, scale, steps )
---> 23 out=pipeline.pipe(prompt=prompt,
24 image=init_image,
25 mask_image=mask_image,
26 strength=0.8,
27 negative_prompt=neg_prompt,
28 num_images_per_prompt=1,
29 height=H,
30 width=W,
31 num_inference_steps=steps,
32 guidance_scale= scale,
33 cross_attention_kwargs={ 'id_embedding': id_embeddings, 'id_scale': id_scale},)
35 out[0]
File /opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.py:1707, in StableDiffusionXLInpaintPipeline.__call__(self, prompt, prompt_2, image, mask_image, masked_image_latents, height, width, strength, num_inference_steps, timesteps, denoising_start, denoising_end, guidance_scale, negative_prompt, negative_prompt_2, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds, ip_adapter_image, output_type, return_dict, cross_attention_kwargs, guidance_rescale, original_size, crops_coords_top_left, target_size, negative_original_size, negative_crops_coords_top_left, negative_target_size, aesthetic_score, negative_aesthetic_score, clip_skip, callback_on_step_end, callback_on_step_end_tensor_inputs, **kwargs)
1705 if ip_adapter_image is not None:
1706 added_cond_kwargs["image_embeds"] = image_embeds
-> 1707 noise_pred = self.unet(
1708 latent_model_input,
1709 t,
1710 encoder_hidden_states=prompt_embeds,
1711 timestep_cond=timestep_cond,
1712 cross_attention_kwargs=self.cross_attention_kwargs,
1713 added_cond_kwargs=added_cond_kwargs,
1714 return_dict=False,
1715 )[0]
1717 # perform guidance
1718 if self.do_classifier_free_guidance:
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py:1112, in UNet2DConditionModel.forward(self, sample, timestep, encoder_hidden_states, class_labels, timestep_cond, attention_mask, cross_attention_kwargs, added_cond_kwargs, down_block_additional_residuals, mid_block_additional_residual, down_intrablock_additional_residuals, encoder_attention_mask, return_dict)
1109 if is_adapter and len(down_intrablock_additional_residuals) > 0:
1110 additional_residuals["additional_residuals"] = down_intrablock_additional_residuals.pop(0)
-> 1112 sample, res_samples = downsample_block(
1113 hidden_states=sample,
1114 temb=emb,
1115 encoder_hidden_states=encoder_hidden_states,
1116 attention_mask=attention_mask,
1117 cross_attention_kwargs=cross_attention_kwargs,
1118 encoder_attention_mask=encoder_attention_mask,
1119 **additional_residuals,
1120 )
1121 else:
1122 sample, res_samples = downsample_block(hidden_states=sample, temb=emb, scale=lora_scale)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py:1160, in CrossAttnDownBlock2D.forward(self, hidden_states, temb, encoder_hidden_states, attention_mask, cross_attention_kwargs, encoder_attention_mask, additional_residuals)
1158 else:
1159 hidden_states = resnet(hidden_states, temb, scale=lora_scale)
-> 1160 hidden_states = attn(
1161 hidden_states,
1162 encoder_hidden_states=encoder_hidden_states,
1163 cross_attention_kwargs=cross_attention_kwargs,
1164 attention_mask=attention_mask,
1165 encoder_attention_mask=encoder_attention_mask,
1166 return_dict=False,
1167 )[0]
1169 # apply additional residuals to the output of the last pair of resnet and attention blocks
1170 if i == len(blocks) - 1 and additional_residuals is not None:
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/diffusers/models/transformer_2d.py:392, in Transformer2DModel.forward(self, hidden_states, encoder_hidden_states, timestep, added_cond_kwargs, class_labels, cross_attention_kwargs, attention_mask, encoder_attention_mask, return_dict)
380 hidden_states = torch.utils.checkpoint.checkpoint(
381 create_custom_forward(block),
382 hidden_states,
(...)
389 **ckpt_kwargs,
390 )
391 else:
--> 392 hidden_states = block(
393 hidden_states,
394 attention_mask=attention_mask,
395 encoder_hidden_states=encoder_hidden_states,
396 encoder_attention_mask=encoder_attention_mask,
397 timestep=timestep,
398 cross_attention_kwargs=cross_attention_kwargs,
399 class_labels=class_labels,
400 )
402 # 3. Output
403 if self.is_input_continuous:
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/diffusers/models/attention.py:366, in BasicTransformerBlock.forward(self, hidden_states, attention_mask, encoder_hidden_states, encoder_attention_mask, timestep, cross_attention_kwargs, class_labels, added_cond_kwargs)
363 if self.pos_embed is not None and self.use_ada_layer_norm_single is False:
364 norm_hidden_states = self.pos_embed(norm_hidden_states)
--> 366 attn_output = self.attn2(
367 norm_hidden_states,
368 encoder_hidden_states=encoder_hidden_states,
369 attention_mask=encoder_attention_mask,
370 **cross_attention_kwargs,
371 )
372 hidden_states = attn_output + hidden_states
374 # 4. Feed-forward
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/diffusers/models/attention_processor.py:527, in Attention.forward(self, hidden_states, encoder_hidden_states, attention_mask, **cross_attention_kwargs)
508 r"""
509 The forward method of the `Attention` class.
510
(...)
522 `torch.Tensor`: The output of the attention layer.
523 """
524 # The `Attention` class can call different attention processors / attention functions
525 # here we simply pass along all tensors to the selected processor class
526 # For standard processors that are defined here, `**cross_attention_kwargs` is empty
--> 527 return self.processor(
528 self,
529 hidden_states,
530 encoder_hidden_states=encoder_hidden_states,
531 attention_mask=attention_mask,
532 **cross_attention_kwargs,
533 )
File /kaggle/working/PuLID/pulid/attention_processor.py:365, in IDAttnProcessor2_0.__call__(self, attn, hidden_states, encoder_hidden_states, attention_mask, temb, id_embedding, id_scale)
359 else:
360 zero_tensor = torch.zeros(
361 (id_embedding.size(0), NUM_ZERO, id_embedding.size(-1)),
362 dtype=id_embedding.dtype,
363 device=id_embedding.device,
364 )
--> 365 id_key = self.id_to_k(torch.cat((id_embedding, zero_tensor), dim=1)).to(query.dtype)
366 id_value = self.id_to_v(torch.cat((id_embedding, zero_tensor), dim=1)).to(query.dtype)
368 id_key = id_key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Float but found Half
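For what it's worth, this particular "expected scalar type Float but found Half" usually means some UNet weights ended up in float32 after the LoRA was loaded while the id_embedding is still fp16. A minimal sketch of a workaround (my own suggestion, not an official PuLID recipe) is to load the LoRA first and then cast the UNet back to fp16, so PuLID's id_to_k / id_to_v projections see a consistent dtype:

import torch

pipeline.pipe.load_lora_weights("/kaggle/input/lorass/acuarelac1400.safetensors")
pipeline.pipe.fuse_lora()  # optional: bake the LoRA into the base weights (standard diffusers API)
pipeline.pipe.unet.to(device="cuda", dtype=torch.float16)  # re-align dtypes after loading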
Here is the error I got on Windows after using PowerShell to create a venv, following the instructions, installing requirements.txt, and running python app.py:
C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torchvision\transforms\functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
warnings.warn(
Please 'pip install xformers'
Please 'pip install apex'
Please 'pip install xformers'
C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\diffusers\configuration_utils.py:244: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a model, please use <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'>.load_config(...) followed by <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Traceback (most recent call last):
File "C:\users\newpc\downloads\pullid\pulid\app.py", line 11, in <module>
pipeline = PuLIDPipeline()
File "C:\users\newpc\downloads\pullid\pulid\pulid\pipeline.py", line 42, in __init__
unet = UNet2DConditionModel.from_config(sdxl_base_repo, subfolder='unet').to(self.device, torch.float16)
File "C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
return self._apply(convert)
File "C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
param_applied = fn(param)
File "C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
(venv) PS C:\users\newpc\downloads\pullid\pulid>
I was able to resolve with:
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
Leaving this note just in case anyone else runs into this issue. It's downloading now... but it still mentions the diffusers.models.unet_2d_condition.UNet2DConditionModel message.
C:\users\newpc\downloads\pullid\pulid\venv\lib\site-packages\diffusers\configuration_utils.py:244: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a model, please use <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'>.load_config(...) followed by <class 'diffusers.models.unet_2d_condition.UNet2DConditionModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
sdxl_lightning_4step_unet.safetensors: 8%|██▋ | 388M/5.14G [00:31<06:11, 12.8MB/s]
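If anyone wants to confirm the CUDA build actually took effect after the reinstall above, a quick sanity check (my own addition, not from the repo):

import torch
print(torch.__version__)           # should report 2.0.1+cu118 after the pip command above
print(torch.cuda.is_available())   # should be True before running python app.py again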
Hello,
Nice project, but since you are using antelopev2 as the face recognition model, it's important to note that your entire project is currently restricted to pure research purposes and cannot be used in commercial environments. To reflect this restriction, I recommend placing a similar notice at the bottom of your page, akin to FaceID's notice found here: https://huggingface.co/h94/IP-Adapter-FaceID
For instance:
'Non-commercial Use: As InsightFace pretrained models are available solely for non-commercial research purposes, IP-Adapter-FaceID models are released exclusively for research purposes and are not intended for commercial use.'
While this restriction is in place, have you considered exploring alternative face recognition models that may not carry such limitations?
Thank you!
Thank you for the open-source code; it has been great for learning. I have a question: I found that the consistency of faces generated at close range (when the characters are close to the camera) is very high, but at long range (when the characters are far from the camera), the faces are prone to collapse.
Will the training scripts be released at any point?
What's the ideal size for the input image? How much of the surrounding environment should it include?
Very good face consistency, but it needs ComfyUI support. Excellent work!
Please add a Hugging Face demo.
I have been testing PuLID in A1111 following this guide
Mikubill/sd-webui-controlnet#2838
But the results are very different from your demo; they are very low quality and low fidelity.
I tested with the same Lightning model, same seed, same steps, and the same ControlNet strength.
As you can see in the second generation, the image is very soft; it looks more like a painted picture. What could cause this issue?
I'm getting this same texture with every model and every sampling method and scheduler I tried, so it must be something in the implementation.
Including the train-from-scratch (tfs) and fine-tuning (ft) scripts, as well as the training dataset? Looking forward to hearing about your plans.
Your efforts and contributions to the open-source community are greatly appreciated.
I notice you use a for loop when num_sample > 1. I tried setting num_images_per_prompt > 1, e.g. 2 or 4, but the results are bad: ID similarity is greatly weakened.
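In case it helps, here is a rough sketch of the looping approach I mean (assuming the seed_everything helper and the inference call signature used in the demo app.py; adjust to your own setup), rather than passing num_images_per_prompt > 1:

images = []
for i in range(4):  # generate 4 samples, each with a different seed
    seed_everything(seed + i)
    img = pipeline.inference(prompt, (1, H, W), neg_prompt, id_embeddings, id_scale, scale, steps)[0]
    images.append(img)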
Using PuLID in combination with FreeU_V2 gives garbled faces with white/pink blotches.
Edit: Problem wasn't FreeU but AYS - the default 10 steps seem too low so cranking up the step count solved the issue.
I got a prompt saying please 'pip install apex' on the console, but PuLID runs fine in ComfyUI without it. I then installed Apex (I'm on Windows) via python setup.py install after git cloning it. The previous warning about installing Apex is gone, but PuLID can no longer run, and I get this error on my ComfyUI console: [No module named 'fused_layer_norm_cuda']. PyTorch is 2.1.2 + CUDA 11.8.
I rolled back by uninstalling Apex, and PuLID works again. I'm just wondering whether a proper Apex install actually speeds up the process. If yes, which version should I install, and how? I'm not that tech-savvy, just an average SD user. Please help... Thanks in advance.
What's the relationship between the Conventional Diffusion branch and the Lightning T2I branch? Do they share the same UNet weights but with different sampling algorithms?
Hello, when you calculated the layout loss and layout-sem loss, which cross-attention layers are the QKV features from? Do you use the features from all the cross-attention layers? I'm looking forward to your reply, thanks!
Thanks for your work, the results are excellent!
I'm doing research related to this right now; could you please upload the stage 2 (maximum ID similarity) model?
Can anyone help me?
(pulid1) D:\PuLID>python app.py
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
WARNING:xformers:A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
File "C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\xformers_init.py", line 55, in _is_triton_available
from xformers.triton.softmax import softmax as triton_softmax # noqa
File "C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\xformers\triton\softmax.py", line 11, in
import triton
ModuleNotFoundError: No module named 'triton'
Please 'pip install apex'
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\diffusers\configuration_utils.py:245: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`.If you were trying to load a model, please use <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.load_config(...) followed by <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 7/7 [00:00<00:00, 11.20it/s]
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
C:\Users\admin\anaconda3\envs\pulid1\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=None`.
warnings.warn(msg)
INFO:root:Loaded EVA02-CLIP-L-14-336 model config.
INFO:root:Shape of rope freq: torch.Size([576, 64])
INFO:root:Loading pretrained EVA02-CLIP-L-14-336 weights (eva_clip).
INFO:root:incompatible_keys.missing_keys: ['visual.rope.freqs_cos', 'visual.rope.freqs_sin', 'visual.blocks.0.attn.rope.freqs_cos', 'visual.blocks.0.attn.rope.freqs_sin', 'visual.blocks.1.attn.rope.freqs_cos', 'visual.blocks.1.attn.rope.freqs_sin', 'visual.blocks.2.attn.rope.freqs_cos', 'visual.blocks.2.attn.rope.freqs_sin', 'visual.blocks.3.attn.rope.freqs_cos', 'visual.blocks.3.attn.rope.freqs_sin', 'visual.blocks.4.attn.rope.freqs_cos', 'visual.blocks.4.attn.rope.freqs_sin', 'visual.blocks.5.attn.rope.freqs_cos', 'visual.blocks.5.attn.rope.freqs_sin', 'visual.blocks.6.attn.rope.freqs_cos', 'visual.blocks.6.attn.rope.freqs_sin', 'visual.blocks.7.attn.rope.freqs_cos', 'visual.blocks.7.attn.rope.freqs_sin', 'visual.blocks.8.attn.rope.freqs_cos', 'visual.blocks.8.attn.rope.freqs_sin', 'visual.blocks.9.attn.rope.freqs_cos', 'visual.blocks.9.attn.rope.freqs_sin', 'visual.blocks.10.attn.rope.freqs_cos', 'visual.blocks.10.attn.rope.freqs_sin', 'visual.blocks.11.attn.rope.freqs_cos', 'visual.blocks.11.attn.rope.freqs_sin', 'visual.blocks.12.attn.rope.freqs_cos', 'visual.blocks.12.attn.rope.freqs_sin', 'visual.blocks.13.attn.rope.freqs_cos', 'visual.blocks.13.attn.rope.freqs_sin', 'visual.blocks.14.attn.rope.freqs_cos', 'visual.blocks.14.attn.rope.freqs_sin', 'visual.blocks.15.attn.rope.freqs_cos', 'visual.blocks.15.attn.rope.freqs_sin', 'visual.blocks.16.attn.rope.freqs_cos', 'visual.blocks.16.attn.rope.freqs_sin', 'visual.blocks.17.attn.rope.freqs_cos', 'visual.blocks.17.attn.rope.freqs_sin', 'visual.blocks.18.attn.rope.freqs_cos', 'visual.blocks.18.attn.rope.freqs_sin', 'visual.blocks.19.attn.rope.freqs_cos', 'visual.blocks.19.attn.rope.freqs_sin', 'visual.blocks.20.attn.rope.freqs_cos', 'visual.blocks.20.attn.rope.freqs_sin', 'visual.blocks.21.attn.rope.freqs_cos', 'visual.blocks.21.attn.rope.freqs_sin', 'visual.blocks.22.attn.rope.freqs_cos', 'visual.blocks.22.attn.rope.freqs_sin', 'visual.blocks.23.attn.rope.freqs_cos', 'visual.blocks.23.attn.rope.freqs_sin']
Fetching 6 files: 100%|████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 2575.30it/s]
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: .\models\antelopev2\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: .\models\antelopev2\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: .\models\antelopev2\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: .\models\antelopev2\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
find model: .\models\antelopev2\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1'}, 'CPUExecutionProvider': {}}
loading from id_adapter
loading from id_adapter_attn_layers
Running on local URL: http://0.0.0.0:7865
INFO:httpx:HTTP Request: GET https://checkip.amazonaws.com/ "HTTP/1.1 200 "
INFO:httpx:HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:7865/startup-events "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: HEAD http://localhost:7865/ "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: HEAD http://localhost:7865/ "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: HEAD http://localhost:7865/ "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: HEAD http://localhost:7865/ "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: HEAD http://localhost:7865/ "HTTP/1.1 502 Bad Gateway"
INFO:httpx:HTTP Request: GET https://api.gradio.app/v2/tunnel-request "HTTP/1.1 200 OK"
Running on public URL: https://b1de752db43ff88cfe.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy
from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Hello, I am discovering your service.
Regarding the photos that we send: are they stored on a server, or is everything done locally with the GPU of the phone or computer?
This is my error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 12.00 GiB total capacity; 10.89 GiB already allocated; 0 bytes free; 11.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Platform: Win10 / 3060 12 GB GPU
When I change Num samples to 1, there's an error too. What should I change in the code? Thanks!
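Not an official fix, but a few standard memory-saving switches worth trying on a 12 GB card (accessing them through pipeline.pipe is my assumption about the wrapper):

# 1) follow the hint in the error message and cap the allocator's split size before launching,
#    e.g. in PowerShell:  $env:PYTORCH_CUDA_ALLOC_CONF = "max_split_size_mb:128"
# 2) offload idle submodules to CPU and slice the VAE decode (standard diffusers options)
pipeline.pipe.enable_model_cpu_offload()
pipeline.pipe.enable_vae_slicing()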
I wanted to load a custom Stable Diffusion model, and also custom LoRA weights, instead of loading SDXL Lightning.
Is there a method, or a planned method, to load a custom pipeline and/or LoRA weights into the PuLIDPipeline class?
I did read issue #7 and implemented a custom model inside pipeline.py,
but it would be preferable to load it from your own script, so the model can be changed easily.
Hey all, great project here. Will this work with OpenPose? Thanks!
Hello, thanks for your incredible work!
In the 'Accurate ID Loss' section in the bottom right corner of Figure 2 of the paper, there are two generated images, both denoted as 'predict x_0'. Are both of these images produced by the Lightning T2I? I guess they represent T2I w/ ID and T2I w/o ID, respectively. However, upon closer inspection, it appears that the IDs of both images are well-preserved, which contradicts my speculation. What do these two images actually mean, and why do you connect them with a vertical line?
Hello, have you considered making a new version on top of playgroundai 2.5?
I understand you are using ArcFace, but only to calculate the loss during training, so I am not sure what the license of the released checkpoints/model weights is?
requirements.txt
ERROR: Could not find a version that satisfies the requirement onnxruntime-gpu (from versions: none)
ERROR: No matching distribution found for onnxruntime-gpu
app.py
AssertionError: Torch not compiled with CUDA enabled
We don't want a web UI or ComfyUI face changer with all the cumbersome dependencies that need to be installed; we just want a Rope-style tool that lets users upload pictures and swap faces.
Hi, I'm a developer working on SD and its related pipelines. PuLID is a great tool for maintaining fidelity when generating new images. However, I find the code difficult to integrate with the Diffusers pipeline.
I would appreciate it if you could make it part of the Diffusers library.
As the title says, I cannot find any details about these inference tricks in the paper. In particular, you use different settings for the fidelity and extreme-style modes.
Here is my understanding; I am not sure whether it is correct.
(1) For NUM_ZERO, you add some zero tokens so that the query can discard ID information (maybe to keep the background uncontaminated? But it is done in an implicit manner.)
(2) For ORTHO or ORTHO_v2, you calculate the projection of id_hidden_states onto hidden_states, then take orthogonal = id_hidden_states - projection
to obtain more disentangled ID information. Is this an experimental finding?
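For reference, this is how I read the ORTHO branch, written out as a small sketch (based on my skim of pulid/attention_processor.py; the exact reduction dimension there may differ, so treat this as an assumption):

import torch

def ortho(id_hidden_states: torch.Tensor, hidden_states: torch.Tensor) -> torch.Tensor:
    # project the ID stream onto the main hidden states ...
    projection = (
        torch.sum(id_hidden_states * hidden_states, dim=-1, keepdim=True)
        / (torch.sum(hidden_states * hidden_states, dim=-1, keepdim=True) + 1e-6)
    ) * hidden_states
    # ... and keep only the component of the ID information orthogonal to them
    return id_hidden_states - projection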
I took the latest update Mikubill/sd-webui-controlnet#2891 and I am noticing that if there are a male and a female in a picture, then PuLID replaces every face with the one specified in ControlNet. Previously, the male/female faces were getting applied correctly. In other words, gender detection is messed up :P
Can you guide me on how to choose the gender or ID of the person to apply PuLID to? Thanks.
Hello, first of all, thank you very much for your efforts. The PuLID v1.1 preview is scheduled to be released, and I was wondering if this pretrained model is simply a model trained using a different base model? Or are there any additional structural changes, such as changes in the model architecture?
Hello!
I really love the Replicate PuLID portrait generator that I am trying now. Could you recommend a tutorial on how to use the settings to fine-tune the output? For example, how can I mix two images, and how can I get the most out of the tool?
thanks!
Thank you for sharing. Do you have plans for multi-ID input and ControlNet?
pipeline.py
class PuLIDPipeline:
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.device = 'cuda'
        sdxl_base_repo = 'stabilityai/stable-diffusion-xl-base-1.0'
        sdxl_lightning_repo = 'ByteDance/SDXL-Lightning'
        self.sdxl_base_repo = sdxl_base_repo
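For the custom-model question, the simplest route I know of is a local edit (my own tweak, not a supported option): point the hard-coded repo id at your own checkpoint before the pipeline is built, e.g.

# pulid/pipeline.py, edited locally; the path below is a placeholder for a custom
# SDXL checkpoint in diffusers format
sdxl_base_repo = '/path/to/my-custom-sdxl-base'   # was 'stabilityai/stable-diffusion-xl-base-1.0'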
L_id = CosSim(φ(C_id), φ(...))
I am trying to implement PuLID in A1111 here: Mikubill/sd-webui-controlnet#2838
However, I found that the VRAM management of facexlib and EVA-CLIP is just very broken. In total they allocate about 3 GB of VRAM, and if you move their models to CPU, only about 1 GB is freed. Another finding is that if you pass a really large image directly to facexlib, each time facexlib runs the VRAM usage increases significantly, depending on the input image size. I think this is a strong signal of a VRAM leak. If you use small input files, the effect is probably unnoticeable.
Maybe I should file an issue in the facexlib repo, but anyway, just letting people know.
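A very rough mitigation sketch for the leak described above (the attribute names are assumptions about the PuLID pipeline / facexlib helpers; adjust to whatever your build actually exposes):

import gc, torch

pipeline.face_helper.face_det.to('cpu')    # facexlib detector (name assumed)
pipeline.face_helper.face_parse.to('cpu')  # facexlib parsing model (name assumed)
pipeline.clip_vision_model.to('cpu')       # EVA-CLIP vision tower (name assumed)
gc.collect()
torch.cuda.empty_cache()                   # hand cached blocks back to the driver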