
stable-diffusion-webui-tensorrt's Introduction

TensorRT Extension for Stable Diffusion

This extension enables the best performance on NVIDIA RTX GPUs for Stable Diffusion with TensorRT. Install the extension and generate optimized engines before use; follow the instructions below to set everything up. Supports Stable Diffusion 1.5, 2.1, SDXL, SDXL Turbo, and LCM. For SDXL and SDXL Turbo, we recommend a GPU with 12 GB of VRAM or more for the best performance, due to the models' size and computational intensity.

Installation

Example instructions for Automatic1111:

  1. Start webui.bat.
  2. Select the Extensions tab and click Install from URL.
  3. Copy the link to this repository and paste it into the "URL for extension's git repository" field.
  4. Click Install.
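
(Judging by the extension folder name Stable-Diffusion-WebUI-TensorRT that appears in the logs further down, the repository URL to paste is presumably https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT.)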

How to use

  1. Click the “Generate Default Engines” button. This step takes 2-10 minutes depending on your GPU; engines for other resolution and batch-size combinations can be generated later.
  2. Go to Settings → User Interface → Quick Settings List and add sd_unet. Apply these settings, then reload the UI.
  3. Back in the main UI, select “Automatic” from the sd_unet dropdown at the top of the page if it is not already selected.
  4. You can now start generating images accelerated by TensorRT. If you need to create more engines, go to the TensorRT tab.

Happy prompting!

LoRA

To use LoRA / LyCORIS checkpoints, they must first be converted to a TensorRT format. This is done in the Export LoRA tab of the TensorRT extension.

  1. Select a LoRA checkpoint from the dropdown.
  2. Click Export. (This does not generate an engine; it only converts the weights, which takes about 20 seconds.)
  3. Use the exported LoRAs as usual via the prompt, e.g. the standard <lora:name:weight> syntax (the name here is a placeholder).

More Information

TensorRT uses optimized engines for specific resolutions and batch sizes. You can generate as many optimized engines as desired; the available options are:

  • The “Export Default Engines” selection adds support for resolutions between 512 x 512 and 768 x 768 for Stable Diffusion 1.5 and 2.1 with batch sizes 1 to 4. For SDXL, this selection generates an engine supporting a resolution of 1024 x 1024 with a batch size of 1.
  • Static engines support a single specific output resolution and batch size.
  • Dynamic engines support a range of resolutions and batch sizes, at a small cost in performance. Wider ranges use more VRAM (see the sketch after this list).
  • Generating the first engine for a checkpoint takes longer; additional engines for the same checkpoint build much faster.
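
To make the profile idea concrete, the sketch below shows how an image resolution and batch size map onto the UNet input shapes that appear in the engine filenames quoted in the issues further down (e.g. sample=1x4x64x64+2x4x64x64+8x4x96x96). It is an illustration inferred from those logs, not part of the extension: the function name is hypothetical, and it assumes SD 1.5 conventions (4 latent channels, one 77-token text chunk with 768-wide embeddings, latent dimensions equal to pixel dimensions divided by 8).

    # Hypothetical sketch: map image size and batch to SD 1.5 UNet input shapes,
    # mirroring the min/opt/max profiles encoded in the .trt filenames below.
    def sd15_profile_shapes(width: int, height: int, batch: int) -> dict:
        assert width % 64 == 0 and height % 64 == 0, "dimensions must be multiples of 64"
        return {
            "sample": (batch, 4, height // 8, width // 8),  # latent-space tensor
            "timesteps": (batch,),
            "encoder_hidden_states": (batch, 77, 768),      # one 77-token text chunk
        }

    print(sd15_profile_shapes(512, 512, 1))  # sample: (1, 4, 64, 64)
    print(sd15_profile_shapes(768, 768, 8))  # sample: (8, 4, 96, 96)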

Each preset can be adjusted with the “Advanced Settings” option. More detailed instructions can be found here.

Common Issues/Limitations

HIRES FIX: If using the hires.fix option in Automatic1111, you must build engines that match both the starting and ending resolutions. For instance, if the initial size is 512 x 512 and hires.fix upscales to 1024 x 1024, you must generate a single dynamic engine that covers the whole range.

Resolution: When generating images, the resolution needs to be a multiple of 64. This applies to hires.fix as well: both the low-res and high-res dimensions must be divisible by 64. A quick check is sketched below.
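
For example, hires.fix at scale 1.5 starting from 512 x 768 gives a 768 x 1152 high-res pass, and all four dimensions are multiples of 64. A trivial check (purely illustrative, not part of the extension):

    # Illustrative only: both the low-res and hires passes must use
    # dimensions divisible by 64.
    def trt_resolution_ok(width: int, height: int) -> bool:
        return width % 64 == 0 and height % 64 == 0

    assert trt_resolution_ok(512, 768)        # low-res pass
    assert trt_resolution_ok(768, 1152)       # 1.5x hires pass
    assert not trt_resolution_ok(512, 840)    # 840 is not a multiple of 64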

Failing CMD arguments:

  • --medvram and --lowvram have caused issues when compiling the engine.
  • --api has caused model.json to not be updated, resulting in SD Unets not appearing after compilation.

Failing installation or TensorRT tab not appearing in UI: This is most likely due to a failed install. To resolve it manually, use this guide.

Requirements

Driver:

  • Linux: >= 450.80.02
  • Windows: >= 452.39

We always recommend keeping the driver up to date for system-wide performance improvements.

stable-diffusion-webui-tensorrt's People

Contributors

contentis, w-e-w


stable-diffusion-webui-tensorrt's Issues

Error when used with ControlNet SoftEdge model, preprocessor softedge_hed

I've successfully built and run it with an engine built for 512x512 resolution.
But it doesn't work with ControlNet yet; here is the error when I try to use it with ControlNet:

SD model: dreamshaper_8.safetensors [879db523c3]
Controlnet Model: control_v11p_sd15_softedge [a8575a2a]
Preprocessor: softedge_hed

2023-10-19 16:30:20,360 - ControlNet - INFO - Preview Resolution = 512
2023-10-19 16:30:26,684 - ControlNet - INFO - Loading model: control_v11p_sd15_softedge [a8575a2a]
2023-10-19 16:30:29,608 - ControlNet - INFO - Loaded state_dict from [F:\stable-diffusion-webui\stable-diffusion-webui-practice\models\ControlNet\control_v11p_sd15_softedge.pth]
2023-10-19 16:30:29,609 - ControlNet - INFO - controlnet_default_config
2023-10-19 16:30:31,759 - ControlNet - INFO - ControlNet model control_v11p_sd15_softedge [a8575a2a] loaded.
2023-10-19 16:30:31,854 - ControlNet - INFO - Loading preprocessor: hed
2023-10-19 16:30:31,855 - ControlNet - INFO - preprocessor resolution = 512
2023-10-19 16:30:32,212 - ControlNet - INFO - ControlNet Hooked - Time = 5.776000738143921
0%| | 0/40 [00:01<?, ?it/s]
*** Error completing request
*** Arguments: ('task(yext2jd8oma0ppw)', '(absurdres, masterpiece, best quality, high quality, highres, ultra-detailed),', 'nsfw, nude, nipple, penis, noise, low res, blur, burry, worst quality, lowres, hermaphrodite, cropped, not in the frame, additional faces, jpeg large artifacts, jpeg small artifacts, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs,', [], 40, 'Euler a', 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x00000272318080A0>, 0, False, '', 0.8, -1, False, -1, 0, 0, 0, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178C5E0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178E320>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178C370>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x0000027045C36E60>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178DAE0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178DA20>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178FBE0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000272317D8130>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000272317DB220>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002723178FA60>, 'from modules.processing import process_images\n\np.width = 768\np.height = 768\np.batch_size = 2\np.steps = 10\n\nreturn process_images(p)', 2, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, None, None, False, 50) {}
Traceback (most recent call last):
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\call_queue.py", line 36, in f
    res = func(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\txt2img.py", line 55, in txt2img
    processed = processing.process_images(p)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\processing.py", line 734, in process_images
    res = process_images_inner(p)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
    return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\processing.py", line 869, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions\sd-webui-controlnet\scripts\hook.py", line 451, in process_sample
    return process.sample_before_CN_hack(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\processing.py", line 1145, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_samplers_kdiffusion.py", line 235, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_samplers_common.py", line 261, in launch_sampling
    return func()
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_samplers_kdiffusion.py", line 235, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
    x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions\sd-webui-controlnet\scripts\hook.py", line 858, in forward_webui
    raise e
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions\sd-webui-controlnet\scripts\hook.py", line 855, in forward_webui
    return forward(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions\sd-webui-controlnet\scripts\hook.py", line 753, in forward
    emb = self.time_embed(t_emb)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\extensions-builtin\Lora\networks.py", line 472, in network_Linear_forward
    return originals.Linear_forward(self, input)
  File "F:\stable-diffusion-webui\stable-diffusion-webui-practice\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

ERROR:root:Could not allocate bytes object!

ERROR:root:Could not allocate bytes object!
MemoryError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Funnystuff\sd.webui\webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 84, in export_onnx
torch.onnx.export(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\torch\onnx\utils.py", line 506, in export
_export(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\torch\onnx\utils.py", line 1587, in _export
) = graph._export_onnx( # type: ignore[attr-defined]
RuntimeError: Could not allocate bytes object!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\blocks.py", line 1117, in call_function
prediction = await utils.async_iteration(iterator)
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\utils.py", line 350, in async_iteration
return await iterator.__anext__()
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\utils.py", line 343, in __anext__
return await anyio.to_thread.run_sync(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\utils.py", line 326, in run_sync_iterator_async
return next(iterator)
File "C:\Funnystuff\sd.webui\system\python\lib\site-packages\gradio\utils.py", line 695, in gen_wrapper
yield from f(*args, **kwargs)
File "C:\Funnystuff\sd.webui\webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 154, in export_unet_to_trt
export_onnx(
File "C:\Funnystuff\sd.webui\webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 129, in export_onnx
exit()
File "_sitebuiltins.py", line 26, in __call__
SystemExit: None

Any fixes for this issue? I got this when trying to export the default engine.

Consistent Error Code 3

I have a consistent Error Code 3 no matter what dimensions I put in. Below is a small excerpt from the cmd for the dimensions of a 768x768 image for realisticvision. I've only created the 7 original TensorRT presets. Also, every image, no matter the dimensions, defaults to "Profile 0:". Thanks!

[I] Loading bytes from D:\stable-diffusion-webui\models\Unet-trt\realisticVisionV51_v51VAE_a0f13c83_cc86_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
Profile 0:
sample = [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)]
timesteps = [(1,), (2,), (8,)]
encoder_hidden_states = [(1, 77, 768), (2, 77, 768), (8, 154, 768)]
latent = [(-1945844936), (-1945837470), (-1945839863)]

0%| | 0/25 [00:00<?, ?it/s]Loading TensorRT engine: D:\stable-diffusion-webui\models\Unet-trt\realisticVisionV51_v51VAE_a0f13c83_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
[I] Loading bytes from D:\stable-diffusion-webui\models\Unet-trt\realisticVisionV51_v51VAE_a0f13c83_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
Profile 0:
sample = [(1, 4, 96, 96), (2, 4, 128, 128), (8, 4, 128, 128)]
timesteps = [(1,), (2,), (8,)]
encoder_hidden_states = [(1, 77, 768), (2, 77, 768), (8, 154, 768)]
latent = [(-1945904758), (-1945904374), (-1945898097)]

[E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2045] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2045, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [1,231,768] for bindings[2] exceed min ~ max range at index 1, maximum dimension in profile is 154, minimum dimension in profile is 77, but supplied dimension is 231.
)
[E] 3: [executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2045] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::nvinfer1::rt::ExecutionContext::validateInputBindings::2045, condition: profileMaxDims.d[i] >= dimensions.d[i]. Supplied binding dimension [1,231,768] for bindings[2] exceed min ~ max range at index 1, maximum dimension in profile is 154, minimum dimension in profile is 77, but supplied dimension is 231.
)
4%|████████ | 1/25 [00:03<01:31, 3.83s/it][

Multiple profiles in one checkpoint?

[screenshot: multi profile]

I've noticed that using Export Engine with different options on one checkpoint creates multiple profiles. If I specify multiple Optimal Width/Height values, will the profiles switch automatically to create an image whenever the resolution changes?

Unable to generate engine on dev branch

Hi, just wondering if anyone else has had this error:

To create a public link, set share=True in launch().
Startup time: 12.2s (prepare environment: 7.4s, import torch: 2.0s, import gradio: 0.5s, setup paths: 0.4s, initialize shared: 0.2s, other imports: 0.4s, load scripts: 0.7s, create ui: 0.2s, gradio launch: 0.1s).
Creating model from config: E:\Apps\stable-diffusion-webui-dev\configs\v1-inference.yaml
Applying attention optimization: Doggettx... done.
Model loaded in 1.5s (load weights from disk: 0.4s, create model: 0.4s, apply weights to model: 0.6s).
{'sample': [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)], 'timesteps': [(1,), (2,), (8,)], 'encoder_hidden_states': [(1, 77, 768), (2, 77, 768), (8, 154, 768)]}
Building TensorRT engine for E:\Apps\stable-diffusion-webui-dev\models\Unet-onnx\bunny4_f5a202a7.onnx: E:\Apps\stable-diffusion-webui-dev\models\Unet-trt\bunny4_f5a202a7_cc89_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[E] 2: [builder.cpp::nvinfer1::builder::createCaskKernelLibraryImpl::185] Error Code 2: Internal Error (Assertion validateCaskKLibDigest(sha256sum) failed. )
Traceback (most recent call last):
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\blocks.py", line 1117, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\utils.py", line 350, in async_iteration
return await iterator.__anext__()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\utils.py", line 343, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\utils.py", line 326, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\gradio\utils.py", line 695, in gen_wrapper
yield from f(*args, **kwargs)
File "E:\Apps\stable-diffusion-webui-dev\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 175, in export_unet_to_trt
ret = export_trt(
^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 146, in export_trt
ret = engine.build(
^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 356, in build
network = network_from_onnx_path(
^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 3, in network_from_onnx_path
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\backend\base\loader.py", line 40, in __call__
return self.call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\util\util.py", line 710, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\backend\trt\loader.py", line 223, in call_impl
builder, network, parser = super().call_impl()
^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\util\util.py", line 710, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\backend\trt\loader.py", line 138, in call_impl
builder, network = create_network(strongly_typed=self.strongly_typed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 3, in create_network
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\backend\base\loader.py", line 40, in __call__
return self.call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\util\util.py", line 710, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\Apps\stable-diffusion-webui-dev\venv\Lib\site-packages\polygraphy\backend\trt\loader.py", line 103, in call_impl
builder = trt.Builder(trt_util.get_trt_logger())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: pybind11::init(): factory function returned nullptr

Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0

I have two strange errors when trying to use this extension (RTX 4070):

  1. After installation, when I start Auto1111, I get an error three times like "can't find entry point for ?destroyTensorDescriptorEx@ops@cudnn......"

  2. When I click Export I get this error: ERROR:root:Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Thank you

Installing on Auto1111 (1.6.0) hangs forever, then WebUI doesn't start

When installing through the extensions tab via the URL as suggested, it gets stuck at "installing" for several minutes and the console never indicates that it's doing anything.

When installing manually via git clone in the extensions folder, it will install quickly without problem, but then WebUI won't launch and hangs at the "commit hash" step:

venv "E:\webui2\new\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4

Error loading script: trt.py, ModuleNotFoundError: No module named 'tensorrt_bindings'

Hello, I did a fresh install using the installation method provided; the extension installs, but the TensorRT tab will not show.

*** Error loading script: trt.py
Traceback (most recent call last):
File "E:\Stable diffusion\SD\webui\modules\scripts.py", line 382, in load_scripts
script_module = script_loading.load_module(scriptfile.path)
File "E:\Stable diffusion\SD\webui\modules\script_loading.py", line 10, in load_module
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "E:\Stable diffusion\SD\webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
import ui_trt
File "E:\Stable diffusion\SD\webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 10, in <module>
from exporter import export_onnx, export_trt
File "E:\Stable diffusion\SD\webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 10, in <module>
from utilities import Engine
File "E:\Stable diffusion\SD\webui\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 32, in <module>
import tensorrt as trt
File "E:\Stable diffusion\SD\webui\venv\lib\site-packages\tensorrt\__init__.py", line 18, in <module>
from tensorrt_bindings import *
ModuleNotFoundError: No module named 'tensorrt_bindings'

'AsyncRequest' object has no attribute '_json_response_data'

When I try to create a new engine, it shows this error at the end:

*** API error: POST: http://127.0.0.1:7860/api/predict {'error': 'LocalProtocolError', 'detail': '', 'body': '', 'errors': "Can't send data when our state is ERROR"}
    Traceback (most recent call last):
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
        await self.app(scope, receive, _send)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
        await response(scope, receive, send)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
        async with anyio.create_task_group() as task_group:
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 597, in __aexit__
        raise exceptions[0]
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
        await func()
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
        return await super().stream_response(send)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
        await send(
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
        await send(message)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 490, in send
        output = self.conn.send(event=response)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
        data_list = self.send_with_data_passthrough(event)
      File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
        raise LocalProtocolError("Can't send data when our state is ERROR")
    h11._util.LocalProtocolError: Can't send data when our state is ERROR

---
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\fastapi\applications.py", line 273, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
    await response(scope, receive, send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
    async with anyio.create_task_group() as task_group:
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
    await func()
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
    return await super().stream_response(send)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
    await send(
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
    await send(message)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 490, in send
    output = self.conn.send(event=response)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "C:\Users\AdziOo\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
    raise LocalProtocolError("Can't send data when our state is ERROR")
h11._util.LocalProtocolError: Can't send data when our state is ERROR
'AsyncRequest' object has no attribute '_json_response_data'

I created a new SD installation and it works (the error does not occur), while the old one shows this error every time, even with no add-ons installed (both SD installs are the same version). Any idea why?

Minimum NVIDIA driver requirement (Linux) for TensorRT acceleration

Hi,

Which minimum driver version is required for TensorRT support on Linux? The README is quite terse on requirements. My setup is currently:

Driver Version: 535.113.01   CUDA Version: 12.2

While “exporting the engines” seems to work fine, and I'm seeing an obvious performance increase, it's unclear if updating the driver is actually required.

Thanks.

Deleting engine profiles?

What is the recommended way to delete engine profiles after they are created, since it seems you can't do it from the UI? Should you just delete the .trt and .onnx files in models/Unet-trt and models/Unet-onnx? What about the profiles in the model.json file?
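
(A manual cleanup along the lines the question suggests might look like the sketch below. This is an unofficial assumption rather than a documented procedure: the helper name is hypothetical, and the model.json location and layout are guessed from the question and the file paths in the logs above, so back up the files first.)

    import json
    from pathlib import Path

    UNET_TRT = Path("models/Unet-trt")
    UNET_ONNX = Path("models/Unet-onnx")

    def delete_engine(trt_name: str, onnx_name: str | None = None) -> None:
        """Hypothetical helper: remove one built engine and its registry entry."""
        (UNET_TRT / trt_name).unlink(missing_ok=True)
        if onnx_name:
            (UNET_ONNX / onnx_name).unlink(missing_ok=True)
        registry = UNET_TRT / "model.json"  # location assumed
        if registry.exists():
            data = json.loads(registry.read_text())
            # Drop any profile entry that references the deleted engine file.
            cleaned = {k: v for k, v in data.items() if trt_name not in json.dumps(v)}
            registry.write_text(json.dumps(cleaned, indent=2))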

[BUG] Input shape must be divisible by 64 in both dimensions. Hires Fix error.

100%|██████████████████████████████████████████████████████████████████████████████████| 36/36 [00:04<00:00,  7.89it/s]
  0%|                                                                                            | 0/8 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(o6gp9seri8repp8)', 'RAW photo, Fujifilm XT3, lighthouse, <lora:epiCRealLife:1>', 'deformed, oversaturated, overcontrasted, asian, chinese, cartoon, cgi, render, illustration, painting, drawing, statue, anime, 2d,', [], 36, 'DPM++ 3M SDE Karras', 1, 1, 10, 512, 768, True, 0.5, 1.65, '4xLSDIR', 8, 0, 0, 'Use same checkpoint', 'DPM++ 2M Karras', '', '', [], <gradio.routes.Request object at 0x0000000F8005F5B0>, 0, False, '', 0.8, 3952135377, False, -1, 0, 0, 0, 0, False, 1, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, 1448, 0, 8, 'DF2K', '4x-UniScaleV2_Sharp', 0.3, 0.1, '', '', 2, 'Noise sync (sharp)', 0.3, 0.05, 0, 'DPM++ 3M SDE', False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 2048, 128, True, True, True, False, False, 7, 100, 'Constant', 0, 'Constant', 0, 4, True, 'MEAN', 'AD', 1, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000000670B9CC0>, 
<scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000000670B8D00>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000000670B9BA0>, False, '', 0.5, True, False, '', 'Lerp', False, [], [], False, 0, 0.8, 0, 0.8, 0.5, False, False, 0.5, 8192, -1.0, None, '', None, True, False, False, False, False, False, 0, 0, '0', 0, False, True, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', None, 1, 1, None, False, False, False, 1, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', None, '', None, True, False, False, False, False, False, 0, 0, '0', 0, False, True, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', None, 1, 1, None, False, False, False, 1, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', None, '', None, True, False, False, False, False, False, 0, 0, '0', 0, False, True, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', None, 1, 1, None, False, False, False, 1, 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', 'CodeFormer', 1, 1, None, 1, 1, ['After Upscaling/Before Restore Face'], 0, 'Portrait of a [gender]', 'blurry', 20, ['DPM++ 2M Karras'], '', True, 0, 1, 0, 1.2, 0.9, 0, 0.5, 0, 1, 1.4, 0.2, 0, 0.5, 0, 1, 1, 1, 0, 0.5, 0, 1, 0, None, False, '0', '0', 'inswapper_128.onnx', 'CodeFormer', 1, True, 'None', 1, 1, False, True, 1, 0, 0, False, 0.5, False, False, 0, 1, 1, 0, 0, 0, 0, False, 'Straight Abs.', 'Flat', False, 0.75, 1, False, False, 3, 0, False, False, 0, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
    Traceback (most recent call last):
      File "F:\WBC\automatic1111_dev\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "F:\WBC\automatic1111_dev\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\modules\txt2img.py", line 55, in txt2img
        processed = processing.process_images(p)
      File "F:\WBC\automatic1111_dev\modules\processing.py", line 734, in process_images
        res = process_images_inner(p)
      File "F:\WBC\automatic1111_dev\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "F:\WBC\automatic1111_dev\modules\processing.py", line 869, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "F:\WBC\automatic1111_dev\modules\processing.py", line 1161, in sample
        return self.sample_hr_pass(samples, decoded_samples, seeds, subseeds, subseed_strength, prompts)
      File "F:\WBC\automatic1111_dev\modules\processing.py", line 1247, in sample_hr_pass
        samples = self.sampler.sample_img2img(self, samples, noise, self.hr_c, self.hr_uc, steps=self.hr_second_pass_steps or self.steps, image_conditioning=image_conditioning)
      File "F:\WBC\automatic1111_dev\modules\sd_samplers_kdiffusion.py", line 188, in sample_img2img
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\WBC\automatic1111_dev\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "F:\WBC\automatic1111_dev\modules\sd_samplers_kdiffusion.py", line 188, in <lambda>
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\WBC\automatic1111_dev\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "F:\WBC\automatic1111_dev\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "F:\WBC\automatic1111_dev\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "F:\WBC\automatic1111_dev\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\modules\sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "F:\WBC\automatic1111_dev\modules\sd_hijack_utils.py", line 28, in __call__
        return self.__orig_func(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "F:\WBC\automatic1111_dev\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "F:\WBC\automatic1111_dev\venv\lib\site-packages\torch\nn\modules\module.py", line 1538, in _call_impl
        result = forward_call(*args, **kwargs)
      File "F:\WBC\automatic1111_dev\modules\sd_unet.py", line 89, in UNetModel_forward
        return current_unet.forward(x, timesteps, context, *args, **kwargs)
      File "F:\WBC\automatic1111_dev\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 84, in forward
        raise ValueError(
    ValueError: Input shape must be divisible by 64 in both dimensions.

---

Error when using hires fix.

Error building engine

Building TensorRT engine for C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\models\Unet-onnx\realismEngineSDXL_v10_af771c3f.onnx: C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\models\Unet-trt\realismEngineSDXL_v10_af771c3f_cc86_sample=2x4x128x128-timesteps=2-encoder_hidden_states=2x77x2048-y=2x2816.trt
ERROR:asyncio:Exception in callback H11Protocol.timeout_keep_alive_handler()
handle: <TimerHandle when=2048093.89 H11Protocol.timeout_keep_alive_handler()>
Traceback (most recent call last):
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_state.py", line 249, in _fire_event_triggered_transitions
new_state = EVENT_TRIGGERED_TRANSITIONS[role][state][event_type]
KeyError: <class 'h11._events.ConnectionClosed'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "asyncio\events.py", line 80, in _run
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 363, in timeout_keep_alive_handler
self.conn.send(event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 468, in send
data_list = self.send_with_data_passthrough(event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 493, in send_with_data_passthrough
self._process_event(self.our_role, event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 242, in _process_event
self._cstate.process_event(role, type(event), server_switch_event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_state.py", line 238, in process_event
self._fire_event_triggered_transitions(role, event_type)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_state.py", line 251, in _fire_event_triggered_transitions
raise LocalProtocolError(
h11._util.LocalProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Loading tactic timing cache from C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\extensions\Stable-Diffusion-WebUI-TensorRT\timing_caches\timing_cache_win_cc86.cache
[E] 1: [myelinCache.cpp::nvinfer1::builder::MyelinAutotunerCache::deserialize::62] Error Code 1: Myelin (Myelin error from unknown graph)
[E] 4: The timing cache will not be used!
[I] Building engine with configuration:
Flags | [FP16, REFIT, TF32]
Engine Capability | EngineCapability.DEFAULT
Memory Pools | [WORKSPACE: 24575.50 MiB, TACTIC_DRAM: 24575.50 MiB]
Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.LAYER_NAMES_ONLY
Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
Building engine: 100%|##########| 6/6 [03:51<00:00, 38.63s/it]
[I] Finished engine building in 233.852 seconds
[E] 1: [myelinCache.cpp::nvinfer1::builder::MyelinAutotunerCache::deserialize::62] Error Code 1: Myelin (Myelin error from unknown graph)
[E] 4: The timing cache will not be used!
[I] Saving tactic timing cache to C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\extensions\Stable-Diffusion-WebUI-TensorRT\timing_caches\timing_cache_win_cc86.cache
[I] Saving engine to C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\models\Unet-trt\realismEngineSDXL_v10_af771c3f_cc86_sample=2x4x128x128-timesteps=2-encoder_hidden_states=2x77x2048-y=2x2816.trt
*** API error: POST: http://127.0.0.1:7860/api/predict {'error': 'LocalProtocolError', 'detail': '', 'body': '', 'errors': "Can't send data when our state is ERROR"}
Traceback (most recent call last):
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
await response(scope, receive, send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
async with anyio.create_task_group() as task_group:
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 597, in __aexit__
raise exceptions[0]
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
await func()
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
return await super().stream_response(send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
await send(
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
'AsyncRequest' object has no attribute '_json_response_data'
await send(message)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 490, in send
output = self.conn.send(event=response)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 468, in send
data_list = self.send_with_data_passthrough(event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
raise LocalProtocolError("Can't send data when our state is ERROR")
h11._util.LocalProtocolError: Can't send data when our state is ERROR


ERROR: Exception in ASGI application
Traceback (most recent call last):
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 408, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
return await self.app(scope, receive, send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\fastapi\applications.py", line 273, in __call__
await super().__call__(scope, receive, send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\applications.py", line 122, in __call__
await self.middleware_stack(scope, receive, send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
raise exc
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\base.py", line 109, in __call__
await response(scope, receive, send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 270, in __call__
async with anyio.create_task_group() as task_group:
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 597, in __aexit__
raise exceptions[0]
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 273, in wrap
await func()
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\base.py", line 134, in stream_response
return await super().stream_response(send)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\responses.py", line 255, in stream_response
await send(
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\starlette\middleware\errors.py", line 159, in _send
await send(message)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 490, in send
output = self.conn.send(event=response)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 468, in send
data_list = self.send_with_data_passthrough(event)
File "C:\StabilityMatrix\Data\Packages\Stable Diffusion WebUI\venv\lib\site-packages\h11\_connection.py", line 483, in send_with_data_passthrough
raise LocalProtocolError("Can't send data when our state is ERROR")
h11._util.LocalProtocolError: Can't send data when our state is ERROR

It then stalls, and I have to reboot the UI.

img2img broken when switching to another model

I figured out how to make the TRT things work,
but now for some reason I can't use other ckpts, in the img2img tab only.
I can only use the converted TRT model; otherwise I get this error:

 Traceback (most recent call last):
      File "F:\SD\webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "F:\SD\webui\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "F:\SD\webui\modules\img2img.py", line 208, in img2img
        processed = process_images(p)
      File "F:\SD\webui\modules\processing.py", line 732, in process_images
        res = process_images_inner(p)
      File "F:\SD\webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "F:\SD\webui\modules\processing.py", line 867, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "F:\SD\webui\modules\processing.py", line 1528, in sample
        samples = self.sampler.sample_img2img(self, self.init_latent, x, conditioning, unconditional_conditioning, image_conditioning=self.image_conditioning)
      File "F:\SD\webui\modules\sd_samplers_kdiffusion.py", line 188, in sample_img2img
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\SD\webui\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "F:\SD\webui\modules\sd_samplers_kdiffusion.py", line 188, in <lambda>
        samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\SD\webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "F:\SD\webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "F:\SD\webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\SD\webui\modules\sd_samplers_cfg_denoiser.py", line 201, in forward
        devices.test_for_nans(x_out, "unet")
      File "F:\SD\webui\modules\devices.py", line 136, in test_for_nans
        raise NansException(message)
    modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

4090 gets CUDA out of memory

I successfully generated an SDXL engine (using the webui dev branch).
When I try to generate a picture, an error occurs when the inference progress bar reaches around 40% (10/25 steps). The error message is as follows.
(My setup: sd_xl_base_1.0_0.9vae_fp16fix, DPM++ 2M SDE Karras, 25 steps, 1024×1024, Win10, driver version 531.61.)
By the way, the SD 1.5 engine works normally.

To create a public link, set share=True in launch().
Startup time: 20.2s (prepare environment: 2.5s, import torch: 4.5s, import gradio: 0.4s, setup paths: 0.5s, initialize shared: 0.2s, other imports: 0.3s, load scripts: 2.4s, create ui: 5.1s, gradio launch: 4.2s).
Activating unet: [TRT] sd_xl_base_1.0_0.9vae_fp16fix
Loading TensorRT engine: D:\kkkkk\release\SD_webui_with_aki_launcher_dev\models\Unet-trt\sd_xl_base_1.0_0.9vae_fp16fix_75679766_cc89_sample=2x4x128x128-timesteps=2-encoder_hidden_states=2x77x2048-y=2x2816.trt
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Loading bytes from D:\kkkkk\release\SD_webui_with_aki_launcher_dev\models\Unet-trt\sd_xl_base_1.0_0.9vae_fp16fix_75679766_cc89_sample=2x4x128x128-timesteps=2-encoder_hidden_states=2x77x2048-y=2x2816.trt
Profile 0:
sample = [(2, 4, 128, 128), (2, 4, 128, 128), (2, 4, 128, 128)]
timesteps = [(2,), (2,), (2,)]
encoder_hidden_states = [(2, 77, 2048), (2, 77, 2048), (2, 77, 2048)]
y = [(2, 2816), (2, 2816), (2, 2816)]
latent = [(2, 4, 128, 128), (2, 4, 128, 128), (2, 4, 128, 128)]

Loading TensorRT engine: D:\kkkkk\release\SD_webui_with_aki_launcher_dev\models\Unet-trt\sd_xl_base_1.0_0.9vae_fp16fix_75679766_cc89_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
[I] Loading bytes from D:\kkkkk\release\SD_webui_with_aki_launcher_dev\models\Unet-trt\sd_xl_base_1.0_0.9vae_fp16fix_75679766_cc89_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
Profile 0:
sample = [(1, 4, 96, 96), (2, 4, 128, 128), (8, 4, 128, 128)]
timesteps = [(1,), (2,), (8,)]
encoder_hidden_states = [(1, 77, 2048), (2, 77, 2048), (8, 154, 2048)]
y = [(1, 2816), (2, 2816), (8, 2816)]
latent = [(-1946005248), (-1946001664), (-1945988864)]

*** Error completing request
*** Arguments: ('task(nfpm4ighzggugtr)', 'a girl', '', [], 25, 'DPM++ 2M SDE Karras', 1, 1, 7, 1024, 1024, False, 0.7, 2, 'R-ESRGAN 4x+ Anime6B', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000002250CB56E60>, 0, False, 'sd_xl_refiner_1.0_0.9vae.safetensors [8d0ce6c016]', 0.7, -1, False, -1, 0, 0, 0, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 1, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 3072, 192, True, True, True, False, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002250CBBBE50>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002250CBB8C70>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002250A6F5AB0>, '无(None)', False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
Traceback (most recent call last):
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\txt2img.py", line 55, in txt2img
processed = processing.process_images(p)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\processing.py", line 734, in process_images
res = process_images_inner(p)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\processing.py", line 869, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\processing.py", line 1145, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_samplers_kdiffusion.py", line 235, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_samplers_common.py", line 261, in launch_sampling
return func()
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_samplers_kdiffusion.py", line 235, in
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\repositories\k-diffusion\k_diffusion\sampling.py", line 626, in sample_dpmpp_2m_sde
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_models_xl.py", line 37, in apply_model
return self.model(x, t, cond)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_hijack_utils.py", line 17, in
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_hijack_utils.py", line 28, in call
return self.__orig_func(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\repositories\generative-models\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
return self.diffusion_model(
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\python\lib\site-packages\torch\nn\modules\module.py", line 1568, in _call_impl
result = forward_call(*args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\modules\sd_unet.py", line 89, in UNetModel_forward
return current_unet.forward(x, timesteps, context, *args, **kwargs)
File "D:\kkkkk\release\SD_webui_with_aki_launcher_dev\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 91, in forward
tmp = torch.empty(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 12.46 GiB. GPU 0 has a total capacty of 23.99 GiB of which 10.77 GiB is free. Of the allocated memory 1.72 GiB is allocated by PyTorch, and 1.06 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


RuntimeError: dictionary changed size during iteration

This happens when a model is still listed in the config file but has since been deleted from the Unet-trt folder; a minimal reproduction follows the tracebacks below.

*** Error loading script: trt.py
    Traceback (most recent call last):
      File "G:\auto_quick\stable-diffusion-webui\modules\scripts.py", line 382, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
      File "G:\auto_quick\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
        import ui_trt
      File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 16, in <module>
        from model_manager import modelmanager, cc_major, TRT_MODEL_DIR
      File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 219, in <module>
        modelmanager = ModelManager()
      File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 35, in __init__
        self.update()
      File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 67, in update
        for base_model, models in base_models.items():
    RuntimeError: dictionary changed size during iteration

---
Exception ignored in: <function ModelManager.__del__ at 0x000001799BA029E0>
Traceback (most recent call last):
  File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 87, in __del__
    self.update()
  File "G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 67, in update
    for base_model, models in base_models.items():
RuntimeError: dictionary changed size during iteration
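
The root cause is generic Python behavior. Here is a minimal sketch (the dictionary contents are made up) of why an update pass like this blows up, and the usual fix of iterating over a snapshot:

```python
# Minimal reproduction: mutating a dict while iterating over it raises
# RuntimeError("dictionary changed size during iteration") in Python 3.
base_models = {"sd15_model": ["engine_a.trt"], "deleted_model": []}

try:
    for base_model, models in base_models.items():
        if not models:  # engine file no longer on disk
            del base_models[base_model]
except RuntimeError as err:
    print(err)  # dictionary changed size during iteration

# The usual fix: iterate over a snapshot of the items instead.
for base_model, models in list(base_models.items()):
    if not models:
        del base_models[base_model]
```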

What is the problem here? (Error executing callback ui_tabs_callback)

Loading weights [e1441589a6] from Y:\Sd-UI-1.6\models\Stable-diffusion\v1-5-pruned.ckpt
*** Error executing callback ui_tabs_callback for Y:\Sd-UI-1.6\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py
Traceback (most recent call last):
File "Y:\Sd-UI-1.6\modules\script_callbacks.py", line 166, in ui_tabs_callback
res += c.callback() or []
File "Y:\Sd-UI-1.6\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 789, in on_ui_tabs
choices=get_valid_lora_checkpoints(),
File "Y:\Sd-UI-1.6\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 573, in get_valid_lora_checkpoints
available_lora_models = get_lora_checkpoints()
File "Y:\Sd-UI-1.6\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 561, in get_lora_checkpoints
metadata = sd_models.read_metadata_from_safetensors(filename)
File "Y:\Sd-UI-1.6\modules\sd_models.py", line 271, in read_metadata_from_safetensors
json_obj = json.loads(json_data)
File "Y:\Sd-UI-1.6\python\lib\json_init_.py", line 346, in loads
return _default_decoder.decode(s)
File "Y:\Sd-UI-1.6\python\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "Y:\Sd-UI-1.6\python\lib\json\decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 1472 (char 1471)

Help ! : "No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile."

Hi !
I'll start by saying that I'm not a power user and I don't edit code and stuff like that, even if I used to be a software developer 20 years ago ;-) Now I'm only a 48-year-old photographer :-D

I just installed TensorRT inside my A1111 and managed to get rid of the initial popup errors by deleting the "cudnn" folder in \venv\Lib\site-packages, as mentioned in another post by @Altrue ...
I can create engines now. I created two of them, for cyberrealistic and epicrealism, but when I try to use them to generate an image I get the error in the title.
I paste the entire console output below ...
Can anyone help?

Thanks !
Paolo

*** Error completing request
*** Arguments: ('task(e1km8uykyuzr5oq)', 'woman', 'bw, worst quality, low quality, (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, (mutated hands and fingers:1.4), disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, cloned', [], 25, 'DPM++ 2M Karras', 1, 1, 7, 512, 512, True, 0.3, 2, '4x_NMKD-Superscale-SP_178000_G', 10, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000002628D1E4910>, 0, False, '', 0.8, 4049516876, False, -1, 0, 0, 0, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, 'prompt keyword', 'keyword1, keyword2', 'None', 'textual inversion first', 'None', '0.7', 'None', True, False, 1, False, False, False, 1.1, 1.5, 100, 0.7, False, False, True, False, False, 0, 'Gustavosta/MagicPrompt-Stable-Diffusion', '', <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002628D180A60>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002628D1819C0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x000002628D181780>, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, 5, 'all', 'all', 'all', '', '', '', '1', 'none', False, '', '', 'comma', '', True, '', '20', 'all', 'all', 'all', 'all', 0, '', None, None, False, None, None, False, None, None, False, 50) {}
Traceback (most recent call last):
File "F:\A1111\stable-diffusion-webui\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "F:\A1111\stable-diffusion-webui\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\modules\txt2img.py", line 55, in txt2img
processed = processing.process_images(p)
File "F:\A1111\stable-diffusion-webui\modules\processing.py", line 732, in process_images
res = process_images_inner(p)
File "F:\A1111\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "F:\A1111\stable-diffusion-webui\modules\processing.py", line 867, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "F:\A1111\stable-diffusion-webui\modules\processing.py", line 1156, in sample
return self.sample_hr_pass(samples, decoded_samples, seeds, subseeds, subseed_strength, prompts)
File "F:\A1111\stable-diffusion-webui\modules\processing.py", line 1242, in sample_hr_pass
samples = self.sampler.sample_img2img(self, samples, noise, self.hr_c, self.hr_uc, steps=self.hr_second_pass_steps or self.steps, image_conditioning=image_conditioning)
File "F:\A1111\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 188, in sample_img2img
samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "F:\A1111\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
return func()
File "F:\A1111\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 188, in
samples = self.launch_sampling(t_enc + 1, lambda: self.func(self.model_wrap_cfg, xi, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "F:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "F:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
File "F:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "F:\A1111\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "F:\A1111\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in call
return self.__orig_func(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "F:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
out = self.diffusion_model(x, t, context=cc)
File "F:\A1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\A1111\stable-diffusion-webui\modules\sd_unet.py", line 89, in UNetModel_forward
return current_unet.forward(x, timesteps, context, *args, **kwargs)
File "F:\A1111\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 87, in forward
self.switch_engine(feed_dict)
File "F:\A1111\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 108, in switch_engine
raise ValueError(
ValueError: No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile. Or use the default (torch) U-Net.

Error loading script: trt.py

After installing the TensorRT extension, this appears before the UI launches. The TensorRT extension does appear in the Extensions tab, but it doesn't appear in the UI itself.

*** Error loading script: trt.py
Traceback (most recent call last):
File "F:\stable-diffusion-webui\modules\scripts.py", line 382, in load_scripts
script_module = script_loading.load_module(scriptfile.path)
File "F:\stable-diffusion-webui\modules\script_loading.py", line 10, in load_module
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in
import ui_trt
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 16, in
from model_manager import modelmanager, cc_major, TRT_MODEL_DIR
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 219, in
modelmanager = ModelManager()
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 35, in init
self.update()
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 67, in update
for base_model, models in base_models.items():
RuntimeError: dictionary changed size during iteration


Exception ignored in: <function ModelManager.__del__ at 0x0000021A6579CDC0>
Traceback (most recent call last):
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 87, in __del__
self.update()
File "F:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\model_manager.py", line 67, in update
for base_model, models in base_models.items():
RuntimeError: dictionary changed size during iteration

Error installing in Automatic1111

Here is the error in the console:

Error running install.py for extension D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py"
*** Error code: 1
*** stdout: Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
*** Collecting tensorrt==9.0.1.post11.dev4
***   Downloading https://pypi.nvidia.com/tensorrt/tensorrt-9.0.1.post11.dev4.tar.gz (18 kB)
***   Preparing metadata (setup.py): started
***   Preparing metadata (setup.py): finished with status 'done'
*** Building wheels for collected packages: tensorrt
***   Building wheel for tensorrt (setup.py): started
***   Building wheel for tensorrt (setup.py): still running...
***   Building wheel for tensorrt (setup.py): finished with status 'done'
***   Created wheel for tensorrt: filename=tensorrt-9.0.1.post11.dev4-py2.py3-none-any.whl size=17618 sha256=e059e2b3b7dd7ecf4c805ab6f2b4589ddb43b0959bfa66178fa0d01559ba1ef8
***   Stored in directory: c:\users\X\appdata\local\pip\cache\wheels\d1\6d\71\f679d0d23a60523f9a05445e269bfd0bcd1c5272097fa931df
*** Successfully built tensorrt
*** Installing collected packages: tensorrt
*** Successfully installed tensorrt-9.0.1.post11.dev4
*** Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
*** Collecting polygraphy
***   Downloading polygraphy-0.49.0-py2.py3-none-any.whl (327 kB)
***      -------------------------------------- 327.9/327.9 kB 4.1 MB/s eta 0:00:00
*** Installing collected packages: polygraphy
*** Successfully installed polygraphy-0.49.0
*** Collecting protobuf==3.20.2
***   Downloading protobuf-3.20.2-cp310-cp310-win_amd64.whl (904 kB)
***      -------------------------------------- 904.0/904.0 kB 4.4 MB/s eta 0:00:00
*** Installing collected packages: protobuf
***   Attempting uninstall: protobuf
***     Found existing installation: protobuf 3.20.0
***     Uninstalling protobuf-3.20.0:
***       Successfully uninstalled protobuf-3.20.0
*** TensorRT is not installed! Installing...
*** Installing nvidia-cudnn-cu11
*** Installing tensorrt
*** removing nvidia-cudnn-cu11
*** Polygraphy is not installed! Installing...
*** Installing polygraphy
*** GS is not installed! Installing...
*** Installing protobuf
***
*** stderr: A matching Triton is not available, some optimizations will not be enabled.
*** Error caught was: No module named 'triton'
*** d:\repos\stable-diffusion-webui\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
***   rank_zero_deprecation(
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'D:\\repos\\stable-diffusion-webui\\venv\\Lib\\site-packages\\google\\~rotobuf\\internal\\_api_implementation.cp310-win_amd64.pyd'
*** Check the permissions.
***
***
*** [notice] A new release of pip available: 22.2.1 -> 23.2.1
*** [notice] To update, run: d:\repos\stable-diffusion-webui\venv\Scripts\python.exe -m pip install --upgrade pip
*** Traceback (most recent call last):
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 30, in <module>***     install()
***   File "D:\repos\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\install.py", line 19, in install
***     launch.run_pip("install protobuf==3.20.2", "protobuf", live=True)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 138, in run_pip
***     return run(f'"{python}" -m pip {command} --prefer-binary{index_url_line}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}", live=live)
***   File "d:\repos\stable-diffusion-webui\modules\launch_utils.py", line 115, in run
***     raise RuntimeError("\n".join(error_bits))
*** RuntimeError: Couldn't install protobuf.
*** Command: "d:\repos\stable-diffusion-webui\venv\Scripts\python.exe" -m pip install protobuf==3.20.2 --prefer-binary
*** Error code: 1

And then when I restarted the webui, I got these popups:

(four screenshots of the error popups, dated 2023-10-17)

What does that mean?

Multi / specific GPU?

Hey! So it took me a while to get SDXL working. I could use the 1.6.0 release of webUI to generate engines for both 1.5 and SDXL, but couldn't do it on the dev branch. I copied the models onto the dev instance and finally got it working. Now the last challenge I'm on is this:

RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I use two 3090s in SLI. Without TensorRT, I have two instances running together using this in the .bat:
set CUDA_VISIBLE_DEVICES= 0, 1
set COMMANDLINE_ARGS=--device-id 1

or --device-id 0 for the other instance.

Any advice to make it work specifically on the second GPU? I can get it working on the dev branch, without these parameters, on the "main" GPU 0.
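
For what it's worth, a minimal sketch of how the two pinning mechanisms compose (the TensorRT specifics are an assumption; only the CUDA_VISIBLE_DEVICES remapping below is standard CUDA behavior). Note also that `CUDA_VISIBLE_DEVICES= 0, 1` embeds whitespace in the value, which may not parse as intended; `0,1` is the canonical form.

```python
import os

# Restrict this process to the second physical GPU *before* CUDA is
# initialized; inside the process it is then addressed as cuda:0, so a
# --device-id style index refers to the visible list, not physical IDs.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch

device = torch.device("cuda:0")  # physical GPU 1 after the remap
x = torch.zeros(1, device=device)
print(torch.cuda.get_device_name(device))
```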

Freezes when exporting engine at "Timing Graph Nodes"

I've exported a couple of engines already, but I tried exporting a static engine (1024x1024 preset), and it freezes every time when "Timing Graph Nodes" is at 4% progress. The GPU looks like it is working, but the percentage is not progressing. Do you know what might be causing it to freeze?

No "Available TensorRT Engine Profiles"

I have exported an engine, but I don't see any profile under the "TensorRT Engine Profiles" selection, and I get this error when exporting an engine.

Building engine:  50%|#####     | 3/6 [00:00<00:00]
[E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[W] Requested amount of GPU memory (34360788477 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[E] 9: Skipping tactic0x0000000000000000 due to exception [::0] autotuning: User allocator error allocating 34360788477-byte buffer
[E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[E] 9: Skipping tactic0x0000000000000000 due to exception [::0] autotuning: User allocator error allocating 34360788477-byte buffer
[W] UNSUPPORTED_STATESkipping tactic 0 due to insufficient memory on requested size of 9261154304 detected for tactic 0x0000000000000000.
[E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[E] 9: Skipping tactic0x0000000000000000 due to exception [::0] autotuning: User allocator error allocating 34360788477-byte buffer
[E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[E] 9: Skipping tactic0x0000000000000000 due to exception [::0] autotuning: User allocator error allocating 34360788477-byte buffer
[E] 1: [defaultAllocator.cpp::nvinfer1::internal::DefaultAllocator::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[E] 9: Skipping tactic0x0000000000000000 due to exception [::0] autotuning: User allocator error allocating 34360788477-byte buffer
Building engine: 100%|##########| 6/6 [02:52<00:00, 28.76s/it]
[I] Finished engine building in 173.326 seconds
[I] Saving tactic timing cache to C:\Users\DjSpa\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\timing_caches\timing_cache_win_cc86.cache
[I] Saving engine to C:\Users\DjSpa\AppData\Roaming\StabilityMatrix\Packages\stable-diffusion-webui\models\Unet-trt\unrealRealism_v40_d1c341b5_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt            

I can see the exported models in the "Unet-onnx" and "Unet-trt" folders, but there are no "trt" entries in the "sd_unet" section; I have only "Automatic" and "None". Sometimes it also gets stuck at "[INFO]: No ONNX file found. Exporting ONNX…" and I have to restart webui, then I hit the same error.
I have a 3070 Ti, if that matters.
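
A side note on the log above: the failed allocations are TensorRT autotuning probes rather than a fatal build error; the build runs to completion and saves the engine. The rejected request works out to roughly 32 GiB, far beyond an 8 GB card:

```python
# The autotuner's rejected allocation from the log above, converted to GiB.
requested = 34_360_788_477  # bytes
print(requested / 2**30)    # ≈ 32.0 GiB, so those tactics are skipped
```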


Copy-pasted TensorRT engines are not visible, why?

I am trying to prepare pre-compiled TensorRT files.

I copy-pasted
sd_xl_base_1.0_be9edd61.onnx
into the folder: Unet-onnx

and copy-pasted:
sd_xl_base_1.0_be9edd61_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
into the folder: Unet-trt

But they do not appear in the dropdown.

This is on the dev version of Automatic1111.
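
The dropdown is populated from the extension's model.json index rather than by scanning the directory, so hand-copied engines stay invisible until they are registered there. A hypothetical, schema-agnostic sanity check (paths assume a default webui layout):

```python
# List .trt files that exist on disk but are not mentioned anywhere in the
# extension's model.json index (default paths assumed).
from pathlib import Path

trt_dir = Path("models/Unet-trt")
index_text = (trt_dir / "model.json").read_text()

on_disk = {p.name for p in trt_dir.glob("*.trt")}
unregistered = {name for name in on_disk if name not in index_text}
print("on disk but not registered:", unregistered)
```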

No ONNX file found. Exporting ONNX... Disabling attention optimization

#54

I found the same issue in a previous issue (linked above), but the cause and solution are different, so I'm writing a new one; it's probably a bug.


When you change an Advanced Setting the first time you use Export Engine, the following error occurs

Loading weights [7e9a9b674e] from C:\stable-diffusion-webui\models\Stable-diffusion\1.5\refslaveV2_v2.safetensors
Loading VAE weights specified in settings: C:\stable-diffusion-webui\models\VAE\1.5\kl-f8-anime2.ckpt
Applying attention optimization: xformers... done.
Weights loaded in 4.4s (send model to cpu: 0.7s, load weights from disk: 0.5s, apply weights to model: 2.1s, load VAE: 0.4s, move model to device: 0.7s).
Exporting 1.5_refslaveV2_v2 to TensorRT
{'sample': [(1, 4, 64, 64), (2, 4, 128, 128), (8, 4, 256, 256)], 'timesteps': [(1,), (2,), (8,)], 'encoder_hidden_states': [(1, 77, 768), (2, 77, 768), (8, 154, 768)]}
No ONNX file found. Exporting ONNX...
Disabling attention optimization
ERROR:root:
Traceback (most recent call last):
File "C:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 74, in export_onnx
inputs = modelobj.get_sample_input(
File "C:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\models.py", line 977, in get_sample_input
latent_height, latent_width = self.check_dims(
File "C:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\models.py", line 263, in check_dims
assert (
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 135, in export_unet_to_trt
export_onnx(
File "C:\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 129, in export_onnx
exit()
File "C:\Users\USER\AppData\Local\Programs\Python\Python310\lib_sitebuiltins.py", line 26, in call
raise SystemExit(code)
SystemExit: None


However, if you run Export Engine with the default options, it will appear to error but the Unet will be created normally, and if you then run a new Export Engine after changing Advanced Settings, it will be created without error.

If this is technically required behavior rather than a bug, it seems like it needs some additional clarification in the UI to avoid confusion.
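
The AssertionError comes from the dimension check run before ONNX export. A plausible reconstruction of the failing check, with the names taken from the traceback but the body assumed:

```python
# Assumed reconstruction of models.py::check_dims(): image dims must be
# multiples of 64 and within the profile's min/max range before export.
def check_dims(height: int, width: int,
               min_side: int = 256, max_side: int = 2048):
    assert height % 64 == 0 and width % 64 == 0
    assert min_side <= height <= max_side
    assert min_side <= width <= max_side
    return height // 8, width // 8  # latent height/width

print(check_dims(512, 768))  # (64, 96)
# check_dims(520, 768) would raise AssertionError, as in the log above.
```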

CUDA out of memory error (on 24GB 4090)

I am using the hot_fix branch, as I couldn't get the main branch to export.
I receive this error whenever I try to generate anything. I tried using a static 512x512 engine with batch size and count at 1, and it made no difference; it always tries to allocate 40.5 GB.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.50 GiB (GPU 0; 23.99 GiB total capacity; 2.56 GiB already allocated; 19.69 GiB free; 2.62 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

ModuleNotFoundError: No module named 'tensorrt_bindings'

Hello,

I've followed the instructions to install the TensorRT extension, but I've encountered three problems:

  1. I have not found the "Generate Default Engines" button described in the README.md
  2. After installation, when I start up the webui, I get these error messages:
*** Error loading script: trt.py
    Traceback (most recent call last):
      File "C:\Users\ferna\SD\modules\scripts.py", line 382, in load_scripts
        script_module = script_loading.load_module(scriptfile.path)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\ferna\SD\modules\script_loading.py", line 10, in load_module
        module_spec.loader.exec_module(module)
      File "<frozen importlib._bootstrap_external>", line 940, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "C:\Users\ferna\SD\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 10, in <module>
        import ui_trt
      File "C:\Users\ferna\SD\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 10, in <module>
        from exporter import export_onnx, export_trt
      File "C:\Users\ferna\SD\extensions\Stable-Diffusion-WebUI-TensorRT\exporter.py", line 10, in <module>
        from utilities import Engine
      File "C:\Users\ferna\SD\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 32, in <module>
        import tensorrt as trt
      File "C:\Users\ferna\SD\venv\Lib\site-packages\tensorrt\__init__.py", line 18, in <module>
        from tensorrt_bindings import *
    ModuleNotFoundError: No module named 'tensorrt_bindings'
  3. No models under SD Unet.

Much appreciated if someone can help me.
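
The traceback shows that the tensorrt 9.x package is a thin wrapper that re-exports tensorrt_bindings (see the `from tensorrt_bindings import *` line), so a partial install can leave the wrapper present while the bindings wheel is missing. A quick probe:

```python
# Check whether the wrapper and the actual bindings package are both present.
import importlib.util

for name in ("tensorrt", "tensorrt_bindings"):
    spec = importlib.util.find_spec(name)
    print(name, "found" if spec else "MISSING")
```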

Comfyui Support?

Hello, I would like to request a ComfyUI repo that makes TensorRT easier to use with ComfyUI rather than via CLI args. I think this would be beneficial, especially for benchmark tests, as A1111 isn't well optimized for inference (it's actually the worst of the UI bunch). I think your analytics would show better numbers after making this switch. Thanks in advance for your input.

Use lora_model_name.json to retrieve metadata

As suggested in #1

In the webui's Extra Networks tab we allow the user to manually set the type of a LoRA model;
the information is stored beside the model under the same name but with a .json extension.

In particular, we do this in webui because some older models, or non-safetensors files, do not have the metadata.

So the order for retrieving the version information should be:
lora_model_name.json > safetensors metadata > "unknown"
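
A minimal sketch of that lookup order (the helper and the "sd version" key are assumptions, not the extension's actual code):

```python
import json
from pathlib import Path

def get_lora_version(model_path: str, read_safetensors_metadata) -> str:
    """Resolve the base-model version for a LoRA checkpoint."""
    sidecar = Path(model_path).with_suffix(".json")
    if sidecar.exists():  # 1. user-provided sidecar wins
        data = json.loads(sidecar.read_text())
        if data.get("sd version"):
            return data["sd version"]
    try:  # 2. embedded safetensors metadata, if any
        meta = read_safetensors_metadata(model_path) or {}
        if meta.get("ss_sd_model_name"):
            return meta["ss_sd_model_name"]
    except Exception:
        pass  # non-safetensors file or unreadable metadata
    return "unknown"  # 3. last resort
```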

Error with SDXL Lora

I switched to the dev branch and SDXL works. However, when I try to convert my trained SDXL LoRA, I get errors.
...
Add Constant /output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/Constant_31_output_0

ERROR:root:Failed to add Constant /output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/Constant_31_output_0

[the same "Add Constant ..." / "ERROR:root:Failed to add Constant ..." pair repeats for dozens of further constants across output_blocks.5 through output_blocks.8 and out.0]

Add Constant /out/out.0/Constant_2_output_0

ERROR:root:Failed to add Constant /out/out.0/Constant_2_output_0

Finished refit. Dumping result to disk.
Traceback (most recent call last):
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\blocks.py", line 1117, in call_function
prediction = await utils.async_iteration(iterator)
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\utils.py", line 350, in async_iteration
return await iterator.anext()
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\utils.py", line 343, in anext
return await anyio.to_thread.run_sync(
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run
result = context.run(func, *args)
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\utils.py", line 326, in run_sync_iterator_async
return next(iterator)
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\gradio\utils.py", line 695, in gen_wrapper
yield from f(*args, **kwargs)
File "A:\stable-diffusion-webui-dev\extensions\Stable-Diffusion-WebUI-TensorRT\ui_trt.py", line 299, in export_lora_to_trt
engine.refit(onnx_base_path, onnx_lora_path, dump_refit_path=trt_lora_path)
File "A:\stable-diffusion-webui-dev\extensions\Stable-Diffusion-WebUI-TensorRT\utilities.py", line 277, in refit
save_file(
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\safetensors\numpy.py", line 74, in save_file
if not _is_little_endian(tensor):
File "A:\stable-diffusion-webui-dev\venv\lib\site-packages\safetensors\numpy.py", line 167, in _is_little_endian
byteorder = tensor.dtype.byteorder
AttributeError: 'NoneType' object has no attribute 'dtype'

Is it currently not possible, or am I doing something wrong?
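
For context on the final crash: safetensors' save_file() reads each tensor's dtype, so any None left in the refit dict (one per "Failed to add Constant" line above) raises exactly this AttributeError. A hypothetical guard, not the extension's actual code:

```python
from safetensors.numpy import save_file

def dump_refit(refit_dict: dict, path: str) -> None:
    # Fail with a useful message instead of an AttributeError deep inside
    # safetensors when a constant failed to refit and stayed None.
    missing = [name for name, tensor in refit_dict.items() if tensor is None]
    if missing:
        raise ValueError(
            f"{len(missing)} constants failed to refit, e.g. {missing[0]}"
        )
    save_file(refit_dict, path)
```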

Unable to export engines: h11._util.LocalProtocolError

Re-posting as an issue, as it seems this is how you like to do things here.
example fix PR #3
the following is copied from AUTOMATIC1111/stable-diffusion-webui-extensions#205 (comment)


One major issue that I'm experiencing randomly is this: it appears almost every time I try to export an engine.
Sometimes when this happens the process completely hangs; other times it continues. I managed to get one export to complete and used it to run SD with TRT, and I can say that it is significantly faster.
But most of the time my engine fails to export, making this almost unusable.

Building TensorRT engine for B:\GitHub\stable-diffusion-webui\models\Unet-onnx\Anime_Anything-V3.0_Anything-V3.0-fp16_Anything-V3.0-pruned-fp16_38c1ebe3.onnx: B:\GitHub\stable-diffusion-webui\models\Unet-trt\Anime_Anything-V3.0_Anything-V3.0-fp16_Anything-V3.0-pruned-fp16_38c1ebe3_cc86_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
ERROR:asyncio:Exception in callback H11Protocol.timeout_keep_alive_handler()
handle: <TimerHandle when=355099.875 H11Protocol.timeout_keep_alive_handler()>
Traceback (most recent call last):
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 249, in _fire_event_triggered_transitions
    new_state = EVENT_TRIGGERED_TRANSITIONS[role][state][event_type]
KeyError: <class 'h11._events.ConnectionClosed'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Programs\Python\3.10.6\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 363, in timeout_keep_alive_handler
    self.conn.send(event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 468, in send
    data_list = self.send_with_data_passthrough(event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 493, in send_with_data_passthrough
    self._process_event(self.our_role, event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_connection.py", line 242, in _process_event
    self._cstate.process_event(role, type(event), server_switch_event)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 238, in process_event
    self._fire_event_triggered_transitions(role, event_type)
  File "B:\GitHub\stable-diffusion-webui\venv\lib\site-packages\h11\_state.py", line 251, in _fire_event_triggered_transitions
    raise LocalProtocolError(
h11._util.LocalProtocolError: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

The engines do seem to be generated (saved to disk), but the profiles are just not created, so webui doesn't see them.

Cause and patch fix

As far as I can see, this is caused by `yield logging_history`.
I've made an edit replacing every `yield logging_history` with `print(logging_history)`, and it seems to work fine.
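
The pattern behind that patch, sketched with assumed surrounding code: the export handler streams its log back to Gradio as a generator, and a very long engine build can outlive uvicorn's keep-alive, producing the h11 error above.

```python
def export_engine_streaming(build_steps):
    # Original shape: a generator that yields the growing log to the UI.
    logging_history = ""
    for step in build_steps:
        logging_history += step() + "\n"
        yield logging_history  # long gaps between yields can hit the timeout

def export_engine_patched(build_steps):
    # Reporter's workaround: log to the console instead of streaming.
    logging_history = ""
    for step in build_steps:
        logging_history += step() + "\n"
        print(logging_history)
    return logging_history
```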

UI can fail to load due to LoRA models missing metadata or non-safetensors filetypes

PR #1
Uncaught exception from get_lora_checkpoints()
AssertionError on non-safetensors filetypes
TypeError when ss_sd_model_name is None

In the webui's Extra Networks tab we allow the user to manually set the type of a LoRA model;
the information is stored beside the model under the same name but with a .json extension.
We do this in webui because some older models, or non-safetensors files, do not have the metadata.

So the order for retrieving the version information should be:
lora_model_name.json > safetensors metadata > "unknown"

ValueError: No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile. Or use the default (torch) U-Net.

I built and successfully exported a TensorRT profile for the Photon SD 1.5 model and selected it in sd_unet; trying to generate with empty prompts, I get this error. Same with another 1.5 model:

Activating unet: [TRT] Realistic_photon_v1
Loading TensorRT engine: F:\projects\AI\SD3\stable-diffusion-webui\models\Unet-trt\Realistic_photon_v1_d902a082_cc86_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
[I] Loading bytes from F:\projects\AI\SD3\stable-diffusion-webui\models\Unet-trt\Realistic_photon_v1_d902a082_cc86_sample=1x4x64x64+2x4x64x64+8x4x96x96-timesteps=1+2+8-encoder_hidden_states=1x77x768+2x77x768+8x154x768.trt
Profile 0:
sample = [(1, 4, 64, 64), (2, 4, 64, 64), (8, 4, 96, 96)]
timesteps = [(1,), (2,), (8,)]
encoder_hidden_states = [(1, 77, 768), (2, 77, 768), (8, 154, 768)]
latent = [(-1945910016), (-1945902336), (-1945908224)]
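
A note on how a request can miss this profile: the engine accepts sample shapes between the min and max listed above, and classifier-free guidance doubles the effective batch (which is why a batch-1-to-4 default build shows a max batch of 8). A small check under those assumptions; the arguments below use batch size 8 at 768x768, which after doubling exceeds the profile:

```python
# Profile bounds from the log above (batch, channels, latent_h, latent_w).
profile_min = (1, 4, 64, 64)
profile_max = (8, 4, 96, 96)

def fits(batch_size: int, height: int, width: int) -> bool:
    # Assumption: CFG doubles the effective batch fed to the UNet.
    shape = (2 * batch_size, 4, height // 8, width // 8)
    return all(lo <= dim <= hi
               for lo, dim, hi in zip(profile_min, shape, profile_max))

print(fits(1, 768, 768))  # True:  (2, 4, 96, 96) is inside the profile
print(fits(8, 768, 768))  # False: (16, 4, 96, 96) exceeds max batch 8
```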

0%| | 0/24 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(jbhvo1u7acymvla)', '', '', [], 24, 'DPM++ 2M SDE Karras', 1, 8, 6.5, 768, 768, False, 0.5, 1.25, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000001848280F2B0>, 0, False, '', 0.8, -1, False, -1, 0, 0, 0, 0, False, 1, False, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 1536, 96, True, True, True, False, True, False, 1, False, False, False, 1.1, 1.5, 100, 0.7, False, False, True, False, False, 0, 'Gustavosta/MagicPrompt-Stable-Diffusion', '', False, 7, 100, 'Constant', 0, 'Constant', 0, 4, True, 'MEAN', 'AD', 1, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000184DF8F2E00>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x0000018679FE9DB0>, <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x00000184DF8F17B0>, False, 'None', 20, 1, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, 50) {}
Traceback (most recent call last):
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\txt2img.py", line 55, in txt2img
processed = processing.process_images(p)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\processing.py", line 732, in process_images
res = process_images_inner(p)
File "F:\projects\AI\SD3\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\processing.py", line 867, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\processing.py", line 1140, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
return func()
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "F:\projects\AI\SD3\stable-diffusion-webui\venv\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 626, in sample_dpmpp_2m_sde
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "F:\projects\AI\SD3\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
File "F:\projects\AI\SD3\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in call
return self.__orig_func(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "F:\projects\AI\SD3\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
out = self.diffusion_model(x, t, context=cc)
File "F:\projects\AI\SD3\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\modules\sd_unet.py", line 89, in UNetModel_forward
return current_unet.forward(x, timesteps, context, *args, **kwargs)
File "F:\projects\AI\SD3\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 87, in forward
self.switch_engine(feed_dict)
File "F:\projects\AI\SD3\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\scripts\trt.py", line 108, in switch_engine
raise ValueError(
ValueError: No valid profile found. Please go to the TensorRT tab and generate an engine with the necessary profile. Or use the default (torch) U-Net.

Specs:

RTX 3070 Ti 8 GB  •  version: v1.6.0  •  python: 3.10.11  •  torch: 2.0.1+cu118  •  xformers: 0.0.20  •  gradio: 3.41.2  •  checkpoint: ec41bd2a82
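For context, a hypothetical sketch of the kind of check behind the error above: an engine is only usable when every requested tensor shape falls inside the [min, max] range of one of its built profiles (names and structure here are illustrative, not the extension's actual code):

def profile_matches(profile: dict, feed_dict: dict) -> bool:
    """Return True when every tensor in feed_dict fits the profile.

    `profile` maps tensor names to (min_shape, opt_shape, max_shape),
    e.g. the sample/timesteps/encoder_hidden_states entries printed
    in the log above.
    """
    for name, tensor in feed_dict.items():
        if name not in profile:
            return False
        min_shape, _opt, max_shape = profile[name]
        for lo, dim, hi in zip(min_shape, tensor.shape, max_shape):
            if not (lo <= dim <= hi):
                return False
    return True

If no built engine passes a check like this for the current resolution, batch size, and prompt length, the only options are building an engine with a matching profile or falling back to the torch U-Net, as the message says.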

What exactly does "Optimal" mean?


When I export a TensorRT-based engine, I know exactly what Min/Max Width and Height are for, but I don't know what Optimal is, and I didn't find anything about it in the description on the right. Does anyone know anything about it?
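For background: in TensorRT's own API, each dynamic-shape profile takes a min, an opt, and a max shape, and the builder tunes its kernels for the opt shape, so generating at the Optimal resolution/batch size should be fastest. A minimal sketch using the standard TensorRT Python API (the shapes are illustrative, for an SD 1.5 U-Net latent):

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Each dynamic-shape profile takes min / opt / max shapes per input.
# TensorRT accepts anything between min and max at runtime, but selects
# and tunes kernels for the opt ("Optimal") shape.
profile = builder.create_optimization_profile()
profile.set_shape(
    "sample",          # U-Net latent input
    (1, 4, 64, 64),    # min: batch 1 at 512x512
    (2, 4, 64, 64),    # opt: what the extension UI calls "Optimal"
    (8, 4, 96, 96),    # max: batch 8 at 768x768
)
config.add_optimization_profile(profile)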

Error installing TensorRT extension on SD.NEXT

Using latest version on master branch of SD.NEXT:
11:09:56-340763 INFO Version: app=sd.next updated=2023-10-14 hash=98bf24d7
url=https://github.com/vladmandic/automatic.git/tree/master

Added the extension, then saw an installer error. In my SDNEXT.log file I found the following:

2023-10-17 11:10:36,666 | sd | DEBUG | installer | TensorRT is not installed! Installing...

C:\sdnext\venv\lib\site-packages\pytorch_lightning\utilities\distributed.py:258: LightningDeprecationWarning: pytorch_lightning.utilities.distributed.rank_zero_only has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from pytorch_lightning.utilities instead.
  rank_zero_deprecation(

+--------------------- Traceback (most recent call last) ---------------------+
| C:\sdnext\extensions_TensorRT\install.py:30 in <module>
|
|   27 shared.opts["quicksettings_list"].append("sd_unet")
|   28 shared.opts.save(shared.config_filename)
|   29
| > 30 install()
|   31
|
| C:\sdnext\extensions_TensorRT\install.py:8 in install
|
|    5 if not launch.is_installed("tensorrt"):
|    6     print("TensorRT is not installed! Installing...")
|    7     launch.run_pip("install nvidia-cudnn-cu11==8.9.4.25", "nvidia-
|  > 8     launch.run_pip("install --pre --extra-index-url https://pypi.n
|    9     launch.run(["python","-m","pip","uninstall","-y","nvidia-cudnn
|   10
|   11 # Polygraphy
+-----------------------------------------------------------------------------+

TypeError: run_pip() got an unexpected keyword argument 'live'
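The failure is SD.NEXT's `launch.run_pip` not accepting the `live` keyword that the extension's installer passes. A minimal compatibility shim, purely as a sketch (assuming the installer's calls look like the ones in the traceback above):

import inspect

import launch  # provided by the host webui (Automatic1111 or SD.NEXT)

def run_pip_compat(args: str, desc: str = None, live: bool = True):
    """Forward to launch.run_pip, dropping the `live` kwarg when the
    host webui's implementation (e.g. SD.NEXT) does not define it."""
    if "live" in inspect.signature(launch.run_pip).parameters:
        return launch.run_pip(args, desc, live=live)
    return launch.run_pip(args, desc)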

Full Info Provided - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Yep, I just showed this.

On the dev branch of auto1111 it fails to generate ONNX, so we generate on the master branch and use it on the dev branch, haha.

Watch the video below to learn how to compile SDXL TensorRT and use it; it covers both the manual and the automatic way.

RTX Acceleration Quick Tutorial With Auto Installer V2 SDXL - Tensor RT


Here is literally everything you may need. I can also follow your instructions and test them.

SD 1.5-based models are working; I even made a tutorial.

RTX Acceleration Quick Tutorial With Auto Installer


Done on a fresh install. Python 3.10.11 and cudnn windows-x86_64-8.9.4.25

First, let me show pip freeze:

Microsoft Windows [Version 10.0.19045.3570]
(c) Microsoft Corporation. All rights reserved.

G:\auto_quick\stable-diffusion-webui\venv\Scripts>activate

(venv) G:\auto_quick\stable-diffusion-webui\venv\Scripts>pip freeze
absl-py==2.0.0
accelerate==0.21.0
addict==2.4.0
aenum==3.1.15
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
altair==5.1.2
antlr4-python3-runtime==4.9.3
anyio==3.7.1
async-timeout==4.0.3
attrs==23.1.0
basicsr==1.4.2
beautifulsoup4==4.12.2
blendmodes==2022
boltons==23.0.0
cachetools==5.3.1
certifi==2023.7.22
charset-normalizer==3.3.0
clean-fid==0.1.35
click==8.1.7
clip==1.0
colorama==0.4.6
contourpy==1.1.1
cycler==0.12.1
deprecation==2.1.0
einops==0.4.1
exceptiongroup==1.1.3
facexlib==0.3.0
fastapi==0.94.0
ffmpy==0.3.1
filelock==3.12.4
filterpy==1.4.5
fonttools==4.43.1
frozenlist==1.4.0
fsspec==2023.9.2
ftfy==6.1.1
future==0.18.3
gdown==4.7.1
gfpgan==1.3.8
gitdb==4.0.10
GitPython==3.1.32
google-auth==2.23.3
google-auth-oauthlib==1.1.0
gradio==3.41.2
gradio_client==0.5.0
grpcio==1.59.0
h11==0.12.0
httpcore==0.15.0
httpx==0.24.1
huggingface-hub==0.18.0
idna==3.4
imageio==2.31.5
importlib-metadata==6.8.0
importlib-resources==6.1.0
inflection==0.5.1
Jinja2==3.1.2
jsonmerge==1.8.0
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
kiwisolver==1.4.5
kornia==0.6.7
lark==1.1.2
lazy_loader==0.3
lightning-utilities==0.9.0
llvmlite==0.41.0
lmdb==1.4.1
lpips==0.1.4
Markdown==3.5
MarkupSafe==2.1.3
matplotlib==3.8.0
mpmath==1.3.0
multidict==6.0.4
networkx==3.1
numba==0.58.0
numpy==1.23.5
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-nvrtc-cu11==11.8.89
nvidia-cuda-runtime-cu11==2022.4.25
nvidia-cuda-runtime-cu117==11.7.60
nvidia-cudnn-cu11==8.9.4.25
oauthlib==3.2.2
omegaconf==2.2.3
onnx==1.14.1
onnx-graphsurgeon==0.3.27
open-clip-torch==2.20.0
opencv-python==4.8.1.78
orjson==3.9.9
packaging==23.2
pandas==2.1.1
piexif==1.1.3
Pillow==9.5.0
platformdirs==3.11.0
polygraphy==0.49.0
protobuf==3.20.2
psutil==5.9.5
pyasn1==0.5.0
pyasn1-modules==0.3.0
pydantic==1.10.13
pydub==0.25.1
pyparsing==3.1.1
PySocks==1.7.1
python-dateutil==2.8.2
python-multipart==0.0.6
pytorch-lightning==1.9.4
pytz==2023.3.post1
PyWavelets==1.4.1
PyYAML==6.0.1
realesrgan==0.3.0
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
requests-oauthlib==1.3.1
resize-right==0.0.2
rpds-py==0.10.6
rsa==4.9
safetensors==0.3.1
scikit-image==0.21.0
scipy==1.11.3
semantic-version==2.10.0
sentencepiece==0.1.99
six==1.16.0
smmap==5.0.1
sniffio==1.3.0
soupsieve==2.5
starlette==0.26.1
sympy==1.12
tb-nightly==2.15.0a20231017
tensorboard-data-server==0.7.1
tensorrt==9.0.1.post11.dev4
tensorrt-bindings==9.0.1.post11.dev4
tensorrt-libs==9.0.1.post11.dev4
tifffile==2023.9.26
timm==0.9.2
tokenizers==0.13.3
tomesd==0.1.3
tomli==2.0.1
toolz==0.12.0
torch==2.0.1+cu118
torchdiffeq==0.2.3
torchmetrics==1.2.0
torchsde==0.2.5
torchvision==0.15.2+cu118
tqdm==4.66.1
trampoline==0.1.2
transformers==4.30.2
typing_extensions==4.8.0
tzdata==2023.3
urllib3==2.0.7
uvicorn==0.23.2
wcwidth==0.2.8
websockets==11.0.3
Werkzeug==3.0.0
xformers==0.0.20
yapf==0.40.2
yarl==1.9.2
zipp==3.17.0

(venv) G:\auto_quick\stable-diffusion-webui\venv\Scripts>

Here is the full log from Automatic1111:

venv "G:\auto_quick\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: protobuf==3.20.2 in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (3.20.2)
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com, https://pypi.ngc.nvidia.com
Requirement already satisfied: onnx-graphsurgeon in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (0.3.27)
Requirement already satisfied: numpy in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (from onnx-graphsurgeon) (1.23.5)
Requirement already satisfied: onnx in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (from onnx-graphsurgeon) (1.14.1)
Requirement already satisfied: typing-extensions>=3.6.2.1 in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (4.8.0)
Requirement already satisfied: protobuf>=3.20.2 in g:\auto_quick\stable-diffusion-webui\venv\lib\site-packages (from onnx->onnx-graphsurgeon) (3.20.2)
GS is not installed! Installing...
Installing protobuf
Installing onnx-graphsurgeon
UI Config not initialized
Launching Web UI with arguments: --xformers
Loading weights [31e35c80fc] from G:\auto_quick\stable-diffusion-webui\models\Stable-diffusion\sd_xl_base_1.0.safetensors
Creating model from config: G:\auto_quick\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
Loading VAE weights specified in settings: G:\auto_quick\stable-diffusion-webui\models\VAE\fp16_sdxl_vae.safetensors
Applying attention optimization: xformers... done.
Model loaded in 3.4s (load weights from disk: 0.7s, create model: 0.2s, apply weights to model: 2.2s, load VAE: 0.1s, calculate empty prompt: 0.1s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 19.0s (prepare environment: 9.3s, import torch: 1.6s, import gradio: 0.5s, setup paths: 0.4s, initialize shared: 0.2s, other imports: 0.3s, load scripts: 3.0s, create ui: 3.5s, gradio launch: 0.2s).
{'sample': [(1, 4, 96, 96), (2, 4, 128, 128), (8, 4, 128, 128)], 'timesteps': [(1,), (2,), (8,)], 'encoder_hidden_states': [(1, 77, 2048), (2, 77, 2048), (8, 154, 2048)], 'y': [(1, 2816), (2, 2816), (8, 2816)]}
Building TensorRT engine for G:\auto_quick\stable-diffusion-webui\models\Unet-onnx\sd_xl_base_1.0_be9edd61.onnx: G:\auto_quick\stable-diffusion-webui\models\Unet-trt\sd_xl_base_1.0_be9edd61_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Loading tactic timing cache from G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\timing_caches\timing_cache_win_cc86.cache
[I] Building engine with configuration:
    Flags                  | [FP16, REFIT, TF32]
    Engine Capability      | EngineCapability.DEFAULT
    Memory Pools           | [WORKSPACE: 24563.50 MiB, TACTIC_DRAM: 24563.50 MiB]
    Tactic Sources         | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
    Profiling Verbosity    | ProfilingVerbosity.LAYER_NAMES_ONLY
    Preview Features       | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
Building engine: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [07:38<00:00, 76.41s/it]
[I] Finished engine building in 462.608 seconds
[I] Saving tactic timing cache to G:\auto_quick\stable-diffusion-webui\extensions\Stable-Diffusion-WebUI-TensorRT\timing_caches\timing_cache_win_cc86.cache
[I] Saving engine to G:\auto_quick\stable-diffusion-webui\models\Unet-trt\sd_xl_base_1.0_be9edd61_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
Downloading VAEApprox model to: G:\auto_quick\stable-diffusion-webui\models\VAE-approx\vaeapprox-sdxl.pt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 209k/209k [00:00<00:00, 1.43MB/s]
Activating unet: [TRT] sd_xl_base_1.0
Loading TensorRT engine: G:\auto_quick\stable-diffusion-webui\models\Unet-trt\sd_xl_base_1.0_be9edd61_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
[I] Loading bytes from G:\auto_quick\stable-diffusion-webui\models\Unet-trt\sd_xl_base_1.0_be9edd61_cc86_sample=1x4x96x96+2x4x128x128+8x4x128x128-timesteps=1+2+8-encoder_hidden_states=1x77x2048+2x77x2048+8x154x2048-y=1x2816+2x2816+8x2816.trt
Profile 0:
        sample = [(1, 4, 96, 96), (2, 4, 128, 128), (8, 4, 128, 128)]
        timesteps = [(1,), (2,), (8,)]
        encoder_hidden_states = [(1, 77, 2048), (2, 77, 2048), (8, 154, 2048)]
        y = [(1, 2816), (2, 2816), (8, 2816)]
        latent = [(-1946014208), (-1946015985), (-1946020080)]

  0%|                                                                                                                      | 0/150 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(nxb62g6nrm8ro2s)', 'photo of a car ', '', [], 150, 'DPM++ 2M SDE Karras', 1, 1, 7, 1024, 1024, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000001DBD1C626B0>, 0, False, '', 0.8, 2608786895, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "G:\auto_quick\stable-diffusion-webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "G:\auto_quick\stable-diffusion-webui\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\modules\txt2img.py", line 55, in txt2img
        processed = processing.process_images(p)
      File "G:\auto_quick\stable-diffusion-webui\modules\processing.py", line 732, in process_images
        res = process_images_inner(p)
      File "G:\auto_quick\stable-diffusion-webui\modules\processing.py", line 867, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "G:\auto_quick\stable-diffusion-webui\modules\processing.py", line 1140, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 626, in sample_dpmpp_2m_sde
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_models_xl.py", line 37, in apply_model
        return self.model(x, t, cond)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "G:\auto_quick\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
        return self.__orig_func(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\repositories\generative-models\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
        return self.diffusion_model(
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\repositories\generative-models\sgm\modules\diffusionmodules\openaimodel.py", line 984, in forward
        emb = self.time_embed(t_emb)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
        input = module(input)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "G:\auto_quick\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 429, in network_Linear_forward
        return originals.Linear_forward(self, input)
      File "G:\auto_quick\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
        return F.linear(input, self.weight, self.bias)
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
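The trace ends in `F.linear` inside `time_embed` with the weight apparently still on the CPU while the input is on cuda:0. A small, hedged diagnostic sketch (assuming `model` is the loaded SDXL diffusion model object) for locating parameters left behind on the CPU:

import torch

def find_cpu_params(model: torch.nn.Module) -> list[str]:
    """Return the names of parameters still on the CPU; a hit under
    time_embed would explain the addmm device mismatch above."""
    return [
        name
        for name, param in model.named_parameters()
        if param.device.type == "cpu"
    ]

# Usage sketch: print(find_cpu_params(shared.sd_model.model.diffusion_model))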

---


No ONNX file found. Exporting ONNX…

Can't "export default engine". I've tried 20 different checkpoints, all return the same error immediately:

Exporting realisticVisionV40_v40VAE to TensorRT
[INFO]: No ONNX file found. Exporting ONNX…

torch/onnx error with certain models (aten::copy)

============= Diagnostic Run torch.onnx.export version 2.0.1+cu118 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 1 ERROR ========================
ERROR: missing-standard-symbolic-function
=========================================
Exporting the operator 'aten::copy' to ONNX opset version 17 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
None
<Set verbose=True to see more details>

ERROR:root:Exporting the operator 'aten::copy' to ONNX opset version 17 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
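One possible (untested) workaround sketch: register a custom symbolic for `aten::copy` before export so ONNX tracing stops rejecting it. `torch.onnx.register_custom_op_symbolic` is a real PyTorch API, but the symbolic below, which simply forwards the source tensor, is a rough approximation, not an official fix:

import torch.onnx

def copy_symbolic(g, self, src, non_blocking=None):
    # Approximate aten::copy(self, src) by forwarding src; this only
    # holds when the copy is a dtype/device passthrough in the trace.
    return src

# Register before calling torch.onnx.export(..., opset_version=17).
torch.onnx.register_custom_op_symbolic("aten::copy", copy_symbolic, 17)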
