
generative-models's People

Contributors

akx, benjaminaubin, brycedrennan, eltociear, gen-ai-experts, jenuk, johngull, lantiga, palp, patrickvonplaten, pharmapsychotic, qp-qp, rromb, timudk, voletiv, yvrjsharma

generative-models's Issues

Autoencoder issues

The published autoencoder weights do not seem to match the model defined in this repo. Specifically, when I try to load the weights, the following keys are missing:
post_quant_conv.bias
post_quant_conv.weight
quant_conv.bias
quant_conv.weight

It would also be really helpful if the actual YAML config used to train the SDXL autoencoder were published (the example in this repo does not seem to correspond to the original).
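A minimal sketch of how such a mismatch can be checked, assuming the repo's instantiate_from_config helper and a locally downloaded copy of the published weights (the config and checkpoint paths below are assumptions, not official ones):

import torch
from omegaconf import OmegaConf
from safetensors.torch import load_file
from sgm.util import instantiate_from_config

# Build the autoencoder from a repo config (path is an assumed example).
config = OmegaConf.load("configs/example_training/autoencoder/kl-f4/imagenet-autoencoder-kl-32x32x4.yaml")
model = instantiate_from_config(config.model)

# Load the published weights non-strictly and inspect the mismatch.
state_dict = load_file("checkpoints/sdxl_vae.safetensors")  # assumed local filename
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)        # the report above lists quant_conv.* and post_quant_conv.*
print("unexpected keys:", unexpected)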

Wrong output for SDXL

I am playing with the provided Streamlit demo and cast the diffusion model to fp16 (to save VRAM); however, the output is not correct. For example:

[attached image: example of the incorrect output]

Meanwhile, I get the correct output with the SD v2.1 768 version.

Can someone help me with this problem?
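One workaround sometimes suggested for half-precision artifacts (an assumption here, not a confirmed fix for this report) is to keep the weights in fp32 and run sampling under fp16 autocast instead of casting the UNet itself:

import torch

# Hedged sketch: `model` and `sampler_fn` stand in for whatever the Streamlit demo builds;
# autocast keeps master weights in fp32 while running most ops in fp16.
def sample_with_autocast(model, sampler_fn, *args, **kwargs):
    with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):
        return sampler_fn(model, *args, **kwargs)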

pip3.exe install -r requirements_pt2.txt error

When I run this command on Windows 11:

$ pip3.exe install -r requirements_pt2.txt
Obtaining taming-transformers from git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers (from -r requirements_pt2.txt (line 38))

......
ERROR: Ignored the following versions that require a different python version: 0.55.2 Requires-Python <3.5; 1.6.2 Requires-Python >=3.7,<3.10; 1.6.3 Requires-Python >=3.7,<3.10; 1.7.0 Requires-Python >=3.7,<3.10; 1.7.1 Requires-Python >=3.7,<3.10
ERROR: Could not find a version that satisfies the requirement triton==2.0.0 (from versions: none)
ERROR: No matching distribution found for triton==2.0.0
(.pt2)

Tierney@BDM /d/Github/generative-models (main)
$ python -V
Python 3.10.6
(.pt2)

Tierney@BDM /d/Github/generative-models (main)
$ pip -V
pip 23.2 from D:\Github\generative-models.pt2\lib\site-packages\pip (python 3.10)
(.pt2)

Tierney@BDM /d/Github/generative-models (main)
$ ^C
(.pt2)

Cannot use; fails with an error

Easy Diffusion v2.5.48
Operating system: Arch-based
Got:
Error: Could not load the stable-diffusion model! Reason: Missing key unet_config full_key: model.params.unet_config object_type=dict

Design Choices

Hi! First of all, thanks for releasing such a great model and accompanying paper. Could you clarify a few design choices in SDXL?

  1. Why do you use both the previous CLIP-L and the new OpenCLIP ViT-bigG? Have you tried using only the latter; wouldn't it be enough on its own?
  2. The crop-conditioning, while avoiding the generation of overly cropped images, seems to produce more duplication, where the object of interest appears everywhere instead of as a single instance. See these comparisons. I wonder why multi-aspect (a.k.a. rectangle) training was not used throughout the whole training process, rather than only during fine-tuning.

FrozenT5Embedder

Could you tell me how to use FrozenT5Embedder (around line 260 in the sgm/modules/encoders/modules.py file) with the 2.1 model?
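A hedged sketch of calling FrozenT5Embedder on its own (the constructor arguments and the smaller T5 variant below are assumptions; note that the released 2.1 weights were not trained with T5 conditioning, so swapping the embedder in would require retraining or fine-tuning the UNet):

import torch
from sgm.modules.encoders.modules import FrozenT5Embedder

# Smaller T5 checkpoint used only to keep the example light (assumption).
embedder = FrozenT5Embedder(version="google/t5-v1_1-base", max_length=77)
embedder = embedder.to("cuda").eval()
with torch.no_grad():
    emb = embedder(["a photo of a cat"])  # -> (batch, max_length, hidden_dim) text embeddings
print(emb.shape)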

ImportError: libGL.so.1 on reproducible environment

Hi, following the provided instructions, I've created a Gitpod branch to automate them, but I keep getting this error even after installing the required OS packages:

  File "/workspace/generative-models/.pt2/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/workspace/generative-models/scripts/demo/sampling.py", line 2, in <module>
    from scripts.demo.streamlit_helpers import *
  File "/workspace/generative-models/scripts/demo/streamlit_helpers.py", line 10, in <module>
    from imwatermark import WatermarkEncoder
  File "/workspace/generative-models/.pt2/lib/python3.10/site-packages/imwatermark/__init__.py", line 1, in <module>
    from .watermark import WatermarkEncoder, WatermarkDecoder
  File "/workspace/generative-models/.pt2/lib/python3.10/site-packages/imwatermark/watermark.py", line 5, in <module>
    import cv2
  File "/workspace/generative-models/.pt2/lib/python3.10/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/workspace/generative-models/.pt2/lib/python3.10/site-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/home/gitpod/.pyenv/versions/3.10.9/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

Do I have to download a specific model from Hugging Face?

SD XL 0.9 base OutOfMemoryError with 24 GB VRAM GPU

I was wondering how much VRAM is required to use this model. An NVIDIA GeForce RTX 3090 GPU with 24 GB VRAM seems to be insufficient when using the default settings on a clean install of Lubuntu 18.04 with CUDA 11.4 and PyTorch 2.0.1+cu117.

Is this the expected behavior or are there things that can be done to reduce memory usage? I tried varying the resolution, but it did not seem to have much of an effect.

Here is a graph of the memory usage when trying to generate an image at either 512 or 1024 resolution with streamlit run scripts/demo/sampling.py --server.port <your_port>

[attached graph: GPU memory usage over time]

The script takes roughly 50 seconds to initialize and uses about 15 GB VRAM until the prompt can be entered. After submitting the prompt, it takes roughly 35 seconds to crash.

This is the corresponding output.

Global seed set to 42
Building a Downsample layer with 2 dims.
  --> settings are: 
 in-chn: 320, out-chn: 320, kernel-size: 3, stride: 2, padding: 1
constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads
WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 2048 and using 10 heads with a dimension of 64.
BasicTransformerBlock is using checkpointing
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 2048 and using 10 heads with a dimension of 64.
BasicTransformerBlock is using checkpointing
constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads
WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 2048 and using 10 heads with a dimension of 64.
BasicTransformerBlock is using checkpointing
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 2048 and using 10 heads with a dimension of 64.
BasicTransformerBlock is using checkpointing
Building a Downsample layer with 2 dims.
  --> settings are: 
 in-chn: 640, out-chn: 640, kernel-size: 3, stride: 2, padding: 1
constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads
WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 2048 and using 20 heads with a dimension of 64.
BasicTransformerBlock is using checkpointing
[the same "Setting up MemoryEfficientCrossAttention" / "BasicTransformerBlock is using checkpointing" pattern repeats for the remaining nine blocks of this SpatialTransformer, for five further SpatialTransformers of depth 10 w/ 1280 channels and 20 heads, and for three further SpatialTransformers of depth 2 w/ 640 channels and 10 heads]
Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.22.mlp.fc2.weight', ... several hundred further vision_model.* keys, plus 'visual_projection.weight', 'text_projection.weight', and 'logit_scale' ...]
- This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CLIPTextModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False
Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Loading model from checkpoints/sd_xl_base_0.9.safetensors
unexpected keys:
['denoiser.log_sigmas']
Global seed set to 42
Global seed set to 42
Global seed set to 42
txt [69, 69]
target_size_as_tuple torch.Size([2, 2])
original_size_as_tuple torch.Size([2, 2])
crop_coords_top_left torch.Size([2, 2])
##############################  Sampling setting  ##############################
Sampler: EulerEDMSampler
Discretization: LegacyDDPMDiscretization
Guider: VanillaCFG
Sampling with EulerEDMSampler for 51 steps:  98%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▊  | 50/51 [00:27<00:00,  1.79it/s]
2023-07-08 11:23:12.395 Uncaught app exception
Traceback (most recent call last):
  File "/home/username/miniconda3/envs/sdxl/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/home/username/Desktop/generative-models/scripts/demo/sampling.py", line 292, in <module>
    out = run_txt2img(
  File "/home/username/Desktop/generative-models/scripts/demo/sampling.py", line 133, in run_txt2img
    out = do_sample(
  File "/home/username/Desktop/generative-models/scripts/demo/streamlit_helpers.py", line 522, in do_sample
    samples_x = model.decode_first_stage(samples_z)
  File "/home/username/miniconda3/envs/sdxl/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/username/Desktop/generative-models/sgm/models/diffusion.py", line 121, in decode_first_stage
    out = self.first_stage_model.decode(z)
  File "/home/username/Desktop/generative-models/sgm/models/autoencoder.py", line 315, in decode
    dec = self.decoder(z, **decoder_kwargs)
  File "/home/username/miniconda3/envs/sdxl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/username/Desktop/generative-models/sgm/modules/diffusionmodules/model.py", line 728, in forward
    h = self.up[i_level].block[i_block](h, temb, **kwargs)
  File "/home/username/miniconda3/envs/sdxl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/username/Desktop/generative-models/sgm/modules/diffusionmodules/model.py", line 131, in forward
    h = nonlinearity(h)
  File "/home/username/Desktop/generative-models/sgm/modules/diffusionmodules/model.py", line 46, in nonlinearity
    return x * torch.sigmoid(x)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 23.70 GiB total capacity; 20.91 GiB already allocated; 464.69 MiB free; 21.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
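Two mitigations that are commonly tried for this crash, sketched here as assumptions rather than the demo's supported options: limiting allocator fragmentation via PYTORCH_CUDA_ALLOC_CONF (as the error text itself suggests) and running the first-stage decode under fp16 autocast:

import os
# Must be set before CUDA is initialized (i.e. before the first tensor lands on the GPU).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch

def decode_low_mem(model, samples_z):
    # Decode latents under autocast to roughly halve activation memory in the VAE decoder.
    with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):
        return model.decode_first_stage(samples_z)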

Minor inefficiencies in the "sgm/util.py" code

There are a few parts in the code that are a little inefficient or unnecessary:

  1. Unnecessary Conversion: In the log_txt_as_img function, txts is converted to a NumPy array and then immediately converted to a Torch tensor. This conversion can be skipped, and txts can be directly returned as a NumPy array.

  2. Inefficient Loop: In the log_txt_as_img function, the loop that creates the txt images can be optimized. Instead of creating a new txt image in each iteration, the loop can be simplified by creating a single txt image and then modifying it for each caption.

I am also submitting a PR with these changes; a sketch of the proposed simplification is shown below.
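A hedged sketch of the kind of simplification described above (the signature loosely follows the existing sgm.util.log_txt_as_img; the default font and character-wrapping details are assumptions):

import numpy as np
from PIL import Image, ImageDraw, ImageFont

def log_txt_as_img(wh, xc):
    # Render each caption in xc onto a (3, H, W) canvas; returns a float32 array in [-1, 1].
    out = np.empty((len(xc), 3, wh[1], wh[0]), dtype=np.float32)
    font = ImageFont.load_default()
    canvas = Image.new("RGB", wh, color="white")        # single reusable canvas (point 2)
    draw = ImageDraw.Draw(canvas)
    nc = int(40 * (wh[0] / 256))                        # characters per line
    for bi, caption in enumerate(xc):
        draw.rectangle([0, 0, wh[0], wh[1]], fill="white")  # clear the canvas for this caption
        lines = "\n".join(caption[s:s + nc] for s in range(0, len(caption), nc))
        draw.text((0, 0), lines, fill="black", font=font)
        out[bi] = np.asarray(canvas).transpose(2, 0, 1) / 127.5 - 1.0
    return out                                          # NumPy array, no torch round-trip (point 1)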

Training config for SDXL

Hi, thanks for sharing the model! I'm trying to finetune the SDXL model but found some inconsistencies in the training and inference configs. For example, configs/example_training/txt2img-clipl.yaml has a different conditioner setup than configs/inference/sd_xl_base.yaml. Could you provide a config file suitable for finetuning the model? Thanks!

How to get stable-datasets?

I wanted to try the toy training examples, but ran into a problem: they require the stable-datasets submodule, which is not publicly accessible.
Please advise on how to obtain this module.

Thanks.

SDXL Different Weights with CLIP2

There seems to be an inconsistency with the text encoder 2 weights. In sd_xl_base_1.0.safetensors, conditioner.embedders.1.model.text_projection is:

tensor([[-0.0348, 0.0218, -0.0025, ..., -0.0034, -0.0138, -0.0075],
[-0.0037, 0.0202, -0.0014, ..., -0.0082, -0.0011, 0.0170],
[ 0.0261, -0.0221, -0.0099, ..., -0.0233, -0.0178, -0.0061],
...,
[ 0.0042, -0.0068, 0.0086, ..., 0.0008, -0.0030, -0.0042],
[-0.0236, 0.0094, 0.0040, ..., -0.0098, 0.0330, 0.0147],
[ 0.0084, -0.0021, -0.0049, ..., 0.0026, -0.0055, -0.0294]],
dtype=torch.float16)

However if I go into text_encoder_2/model.fp16.safetensors, I see text_projection.weight is:

tensor([[-0.0348, -0.0037, 0.0261, ..., 0.0042, -0.0236, 0.0084],
[ 0.0218, 0.0202, -0.0221, ..., -0.0068, 0.0094, -0.0021],
[-0.0025, -0.0014, -0.0099, ..., 0.0086, 0.0040, -0.0049],
...,
[-0.0034, -0.0082, -0.0233, ..., 0.0008, -0.0098, 0.0026],
[-0.0138, -0.0011, -0.0178, ..., -0.0030, 0.0330, -0.0055],
[-0.0075, 0.0170, -0.0061, ..., -0.0042, 0.0147, -0.0294]],
dtype=torch.float16)

Which one should be correct? They do lead to different results.
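Both appear to be correct, just stored in different layouts: open_clip keeps text_projection as a raw parameter applied as x @ text_projection, while the Hugging Face CLIPTextModelWithProjection stores it as the weight of a bias-free nn.Linear, which is applied as x @ weight.T. The two matrices shown above are transposes of each other (compare the first row of one with the first column of the other). A quick check, assuming both files are on disk:

import torch
from safetensors.torch import load_file

sgm_sd = load_file("sd_xl_base_1.0.safetensors")
hf_sd = load_file("text_encoder_2/model.fp16.safetensors")

a = sgm_sd["conditioner.embedders.1.model.text_projection"]
b = hf_sd["text_projection.weight"]
print(torch.equal(a, b.T))  # True would confirm they are the same matrix, transposed

If that prints True, any difference in outputs comes from how the projection is applied, not from the weights themselves.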

Stable Diffusion XL - M1 mac doesn't work with fp16 on tutorial script - LLVM ERROR: Failed to infer result type(s)

I'm still getting this issue when trying the basic tutorial for SDXL inference (16 GB MacBook Pro M1).

This mostly works (if I strip out the tutorial's recommendation for fp16), but it takes forever (about 66 seconds per iteration) and then dies on the 50th iteration due to "MPS backend out of memory":

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", use_safetensors=True)
pipe = pipe.to("mps")
pipe.enable_attention_slicing()
prompt = "An astronaut riding a green horse"
images = pipe(prompt=prompt).images[0]

The recommended call:

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")

results in the error previously mentioned:

loc("varianceEps"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/97f6331a-ba75-11ed-a4bc-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":228:0)): error: input types 'tensor<1x77x1xf16>' and 'tensor<1xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
Abort trap: 6

/Users/mike/miniconda3/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Running torch 2.0.1, installed from the requirements.txt as per the README on this repo.

Anything I can do? I've got it working successfully on a 1080 Ti and a T4 (just following the tutorial with no modifications), but I'm stuck on the M1.
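One hedged workaround, not an official recommendation: stay in fp32 on MPS (the fp16 path appears to hit a Metal kernel limitation) and trade speed for memory instead. Everything below uses documented PyTorch/diffusers knobs, but whether it fits comfortably in 16 GB is an assumption:

import os

# Lifts the hard MPS memory cap (read at allocator start-up, so set before any torch work);
# the trade-off is potential swapping instead of a hard abort.
os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", use_safetensors=True
)
pipe = pipe.to("mps")
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()  # decode the latents in slices, which is where the end-of-run OOM hits

prompt = "An astronaut riding a green horse"
image = pipe(prompt=prompt, num_inference_steps=30, height=768, width=768).images[0]

Fewer steps and a smaller resolution are doing most of the work here; SDXL is trained for 1024x1024, so expect some quality loss at 768x768.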

SDXL 1.0 Tutorials - Not An Issue Thread

Design question: Why don't you use v-prediction target?

Hi! First of all, thanks for a very good model. Stable Diffusion v2 used the v-prediction target and argued that it is better than the default epsilon prediction, so why does SDXL go back to the epsilon target for training?
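For readers comparing the two objectives: they are linear re-parameterizations of each other, so the choice is about optimization behaviour and loss weighting across noise levels rather than expressiveness. In the usual notation with \alpha_t^2 + \sigma_t^2 = 1 (following the progressive-distillation formulation, not code from this repository):

x_t = \alpha_t x_0 + \sigma_t \epsilon, \qquad
v_t \equiv \alpha_t \epsilon - \sigma_t x_0, \qquad
\epsilon = \alpha_t v_t + \sigma_t x_t, \qquad
x_0 = \alpha_t x_t - \sigma_t v_t.

So a v-prediction model can always be converted to an epsilon-prediction one (and vice versa) at sampling time; the training target mainly changes how errors are weighted at different noise levels.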

How convert img2img to inpaint

Hi, this is great work. I want to write an inpainting script on top of img2img, following the stablediffusion 2.1 project. I need to fuse the image and mask information into the network; the following is the method I am using now, but unfortunately it is wrong. Can you help me?
[screenshot of the current approach]
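In case it helps while waiting for an answer: the usual way to turn an img2img sampler into an inpainting one is not to feed the mask into the network at all, but to blend latents at every denoising step, letting the model update the whole latent and then overwriting the region outside the mask with a re-noised copy of the original image's latents. A generic sketch with hypothetical names (EDM-style noise scaling assumed, not this repository's sampler API):

import torch

def masked_denoise_step(denoise_step, z_t, z_orig, mask, sigma_t):
    # denoise_step: hypothetical callable performing one sampler update on z_t
    # z_orig: clean latents of the input image (encoded once, up front)
    # mask: 1 where the model may paint, 0 where the original image must be kept
    z_t = denoise_step(z_t)                                # model update everywhere
    z_known = z_orig + sigma_t * torch.randn_like(z_orig)  # re-noise the known region to the current level
    return mask * z_t + (1.0 - mask) * z_known             # keep original content outside the mask

Only UNets explicitly trained with mask/masked-image channels (the dedicated inpainting checkpoints) expect the mask as a network input; for the base model, the blending approach above is the standard fallback.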

Is it possible to implement openai-like image edit and image variation with SDXL

We are trying to replace the image generation of our app with SDXL; we have implemented the image generation API in an OpenAI style. Here is our llm-as-openai repo, which tries to replace OpenAI chat APIs with open source language models and image generation with SDXL.

However, I'm quite new to SDXL. Is it possible to implement the OpenAI Image Edit and the Create image variation with SDXL?

The Image Edit API is used to create an edited or extended image given an original image and a prompt. And the Create image variation is to create a variation of a given image.
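Both endpoints map fairly naturally onto the diffusers SDXL pipelines, assuming a diffusers release with SDXL support: "image edit" corresponds to inpainting (original image + mask + prompt) and "create image variation" to img2img with a neutral prompt and moderate strength. A hedged sketch, not an official equivalence:

import torch
from diffusers import StableDiffusionXLInpaintPipeline
from diffusers.utils import load_image

image = load_image("input.png").resize((1024, 1024))  # placeholder original image
mask = load_image("mask.png").resize((1024, 1024))    # white = region to repaint

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

edited = pipe(prompt="a cozy fireplace", image=image, mask_image=mask).images[0]

# "Create image variation" is the same idea without a mask: run
# StableDiffusionXLImg2ImgPipeline with strength around 0.5 and a neutral prompt.

Expect behavioural differences from OpenAI's endpoints: SDXL needs a prompt to steer edits, and variations are controlled by the strength parameter rather than being a dedicated mode.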

Can't get triton package on Windows

Is this repo Linux-only? When I try to install the requirements, the install fails on the triton package.

I checked around and the triton package only supports Linux. Do I need a separate Linux boot to get this repo working, or can I get it working with WSL? Right now I'm stuck in WSL trying to get a disk mounted, but that's a whole other issue based on inexperience. I'll work it out tomorrow. Hopefully someone can chime in before then.

RuntimeError: Error(s) in loading state_dict for LatentDiffusion

Dear developers,
when I change the model to SDXL, I get the error below:
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
size mismatch for model.diffusion_model.time_embed.0.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([1280, 320])
………………………………………………………………
How can I resolve this problem?

Image to Image - Masking Does not work in SDXL 0.9

Hello SD Team, while using the API and playing with the Image to Image / Masking endpoint, I noticed, and it has been confirmed by a developer in Discord, that the masking is being partially ignored.

Can anyone look into this bug and fix it? It's important that nothing behind the black pixels is modified by the diffusion.

Awaiting response.

Thank you.
[attached example generations and input images showing content outside the masked region being modified]

fp16 in the UNet

Hi! SDXL runs out of memory (OOM) when I finetune the UNet. I see that fp16 is not used here. Will this be supported?
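While waiting for official support, here is a minimal mixed-precision sketch for a hand-rolled loop. This is not the repository's Lightning-based training code, and compute_loss is a hypothetical stand-in for whatever returns the diffusion loss for a batch; in a Lightning setup the equivalent is usually just setting the trainer's precision to 16.

import torch

scaler = torch.cuda.amp.GradScaler()

def fp16_training_step(model, batch, optimizer, compute_loss):
    optimizer.zero_grad(set_to_none=True)
    # Master weights stay in fp32; the UNet forward/backward runs under fp16 autocast.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = compute_loss(model, batch)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()

Gradient checkpointing and 8-bit optimizers are the other common levers when the UNet alone does not fit.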

img2img

I tried passing in an image as I have with earlier models:

images = pipe(prompt=prompt, image=init_img).images[0]

and learned that this doesn't work.

Is there a way to do img2img operations with this model?
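Two routes exist: the streamlit demo in this repository (scripts/demo/sampling.py) exposes an img2img mode alongside txt2img, and on the diffusers side the image-conditioned pipeline is StableDiffusionXLImg2ImgPipeline rather than the text-to-image pipeline you get by default. A minimal sketch with a placeholder input file:

import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

init_img = load_image("init.png").resize((1024, 1024))  # placeholder input image
prompt = "a detailed oil painting of the same scene"
images = pipe(prompt=prompt, image=init_img, strength=0.6).images

strength controls how far the result may drift from the input (near 0 returns the input, near 1 behaves like txt2img).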

Question about the finetuning.

Hi, I would like to finetune SDXL on a custom dataset. I merged the inference config with the training config in order to load the pretrained model, but there is an error about target_size_as_tuple.

Where should I put the target height and target width for my own dataset?
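The size/crop conditioning is read from the batch dict by GeneralConditioner, keyed by the embedders' input_key values, so the dataset (or a collate/wrapper function) has to supply them per sample. A sketch with key names taken from the SDXL conditioner config; the (batch_size, 2) shape and integer contents are my assumption:

import torch

def add_size_conditioning(batch, batch_size, height, width, crop_top=0, crop_left=0):
    batch["original_size_as_tuple"] = torch.tensor([height, width]).repeat(batch_size, 1)
    batch["crop_coords_top_left"] = torch.tensor([crop_top, crop_left]).repeat(batch_size, 1)
    batch["target_size_as_tuple"] = torch.tensor([height, width]).repeat(batch_size, 1)
    return batch

original_size_as_tuple should describe the source image before any resizing, target_size_as_tuple the training resolution, and crop_coords_top_left the crop offset actually applied, mirroring the micro-conditioning described in the SDXL report.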

sampling.py demo crashing

I am pretty new to this stuff. I got WSL up and running and set up CUDA. I downloaded everything and I am able to start the streamlit application, but it quickly climbs to 28 GB of RAM and then the application is killed.

This may be obvious to some, but not to me right now. Am I supposed to change a setting somewhere to get this working right?

I've got 32 GB of RAM and a 3090 Ti with 24 GB of VRAM. It doesn't seem to be using the GPU much, if at all, so I assume it's running out of system memory; I'm not sure though. I've tried talking to people in the Discord, but usually my questions get ignored. I don't think many people there are trying to use this repo.

This is what it looks like when it crashes

  • This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing CLIPTextModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False
    Killed

If you need any other information I am happy to provide it; just let me know. I'd really like to learn this and get it running as intended.

Running SDXL on multi GPUs

Enabling Multi-GPU Support for SDXL

Dear developers,

I am currently using SDXL for my project, and I am encountering some difficulties with enabling multi-GPU support. I have four Nvidia 3090 GPUs at my disposal, but so far I have only been able to run the software on one of them; running such a large model on a single 3090 is very slow and memory-consuming.

My attempts to distribute the workload among all GPUs have been unsuccessful. These have included:

  • Attempt: use Accelerate to configure multiple GPUs and run.

Unfortunately, these methods have not resulted in successful multi-GPU utilization.

For your reference, here is some information about my setup:

  • Operating System: Ubuntu
  • Python Version: 3.10.12
  • CUDA Version: 12.2
  • Deep Learning Framework: PyTorch 2.0.1

I'm not sure if I'm missing something or if there's an issue with SDXL itself. Therefore, I'm writing to ask if you could provide some guidance on this matter. Is there a built-in way to enable multi-GPU support, or are there any additional steps that I might have overlooked? Thank you!

No module named 'scripts' when trying to do Inference

I am following the inference part of the README step by step, but I ran into No module named 'scripts' when running

streamlit run scripts/demo/sampling.py --server.port <your_port>

Has anyone run into this issue and solved it?

The environment I chose is pt2.
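A likely cause (my assumption, not a confirmed diagnosis): scripts/demo/sampling.py imports its helpers as a scripts.* package, so the repository root has to be the working directory and also on PYTHONPATH. Launching as PYTHONPATH=. streamlit run scripts/demo/sampling.py --server.port <your_port> from the repo root usually resolves it; the programmatic equivalent would be:

import os
import sys

# Added at the top of the entry point, this puts the current working directory
# (assumed to be the repo root) on the import path.
sys.path.insert(0, os.getcwd())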

Potential bug in BasicTransformerBlock

In the forward method of BasicTransformerBlock, the keyword arguments additional_tokens and n_times_crossframe_attn_in_self are ignored: they are collected into kwargs, but kwargs is never passed to checkpoint; only x and context are.

So it is currently impossible to change these keywords from their defaults, which might cause unexpected behaviour.

https://github.com/Stability-AI/generative-models/blob/76e549dd94de6a09f9e75eff82d62377274c00f8/sgm/modules/attention.py#L460C47-L460C47

Preview of above:

    def forward(
        self, x, context=None, additional_tokens=None, n_times_crossframe_attn_in_self=0
    ):
        kwargs = {"x": x}

        if context is not None:
            kwargs.update({"context": context})

        if additional_tokens is not None:
            kwargs.update({"additional_tokens": additional_tokens})

        if n_times_crossframe_attn_in_self:
            kwargs.update(
                {"n_times_crossframe_attn_in_self": n_times_crossframe_attn_in_self}
            )

        # return mixed_checkpoint(self._forward, kwargs, self.parameters(), self.checkpoint)
        return checkpoint(
            self._forward, (x, context), self.parameters(), self.checkpoint
        )
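One possible workaround sketch (not an official patch): bind the non-tensor keywords with functools.partial so that only tensors flow through the custom checkpoint wrapper, e.g. replacing the forward method above with:

from functools import partial  # module-level import in sgm/modules/attention.py

def forward(
    self, x, context=None, additional_tokens=None, n_times_crossframe_attn_in_self=0
):
    # Bind the extra keywords up front; checkpoint still only sees (x, context).
    fn = partial(
        self._forward,
        additional_tokens=additional_tokens,
        n_times_crossframe_attn_in_self=n_times_crossframe_attn_in_self,
    )
    return checkpoint(fn, (x, context), self.parameters(), self.checkpoint)

This keeps gradient checkpointing intact while letting callers override the two keywords; whether that is the intended behaviour is for the maintainers to confirm.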

Release of SD-XL unclip version

Hi, SD-XL is great work for image generation, with better visual quality and text-image alignment. I found there is a demo, REIMAGINE XL, which appears to be an XL version of the SD v2.1 unCLIP model. I am just wondering whether the weights of the unCLIP version of SD-XL (i.e., taking an image embedding as input for image variations) will be released? Thanks a lot!
