
Comments (5)

djparente commented on June 18, 2024

Hrm. I have encountered this issue also on a 3GB GTX 1060. No matter the image size (even, for example, -s 16 16) it will inevitably crash after a few iterations with an out of memory error, similar to the one above. I can get it running on the CPU (although slowly).

I am wondering if there is a memory leak somewhere: once the train / ascend_txt loop begins, I would expect memory utilization to remain approximately stable. I expanded one of the lines in ascend_txt to:

    out = synth(z)
    mcout = make_cutouts(out)
    nmcout = normalize(mcout)
    encoded = perceptor.encode_image(nmcout)  # <- Commenting this out seems to resolve the crash
    iii = encoded.float()
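
A quick way to test the leak hypothesis (a diagnostic sketch, not code from the repo; `log_cuda_mem` is a made-up helper name) is to print the allocated CUDA memory after each of those lines, every iteration:

    import torch

    def log_cuda_mem(tag: str, i: int) -> None:
        # memory_allocated() reports bytes held by live tensors on the GPU.
        # A steady climb across iterations means something is retaining
        # references to earlier iterations' tensors.
        mib = torch.cuda.memory_allocated() / 2**20
        print(f"iter {i} [{tag}]: {mib:.1f} MiB allocated")

Calling it after each line above (e.g. `log_cuda_mem("encode_image", i)` right after the `perceptor.encode_image` call) should show which step the growth follows.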

Some further experimentation makes me wonder if there is a problem with CLIP/clip/model.py at:

    def forward(self, x: torch.Tensor):
        x = self.conv1(x)  # shape = [*, width, grid, grid]
        x = x.reshape(x.shape[0], x.shape[1], -1)  # shape = [*, width, grid ** 2]
        x = x.permute(0, 2, 1)  # shape = [*, grid ** 2, width]
        x = torch.cat([self.class_embedding.to(x.dtype) + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device), x], dim=1)  # shape = [*, grid ** 2 + 1, width]
        x = x + self.positional_embedding.to(x.dtype)
        x = self.ln_pre(x)

        x = x.permute(1, 0, 2)  # NLD -> LND
        x = self.transformer(x)  # <- Commenting this out also resolves the crash
        x = x.permute(1, 0, 2)  # LND -> NLD

        x = self.ln_post(x[:, 0, :])

        if self.proj is not None:
            x = x @ self.proj

        return x

I don't think I understand CUDA or Torch well enough to propose a solution. I tried following the call path through self.transformer(x) and added some del statements, but wasn't able to resolve the possible leak.
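
For what it's worth, one common cause of exactly this symptom in PyTorch training loops (an assumption, not a confirmed diagnosis of this repo) is storing loss tensors across iterations, e.g. appending them to a list for logging: each stored loss keeps its entire autograd graph, including the CLIP activations, alive. A minimal sketch of that pattern and its fix, with hypothetical names:

    import torch

    losses = []  # history kept for logging

    def train_step(opt: torch.optim.Optimizer, loss: torch.Tensor) -> None:
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Store a plain Python float. Appending `loss` itself would retain
        # the whole graph (VQGAN + CLIP activations) for every past step.
        losses.append(loss.item())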

Any insight you have, nerdyrodent? Thanks for all your work on this really interesting package.


nerdyrodent commented on June 18, 2024

Good question, and one I'd like to know the answer to as well!


zhanghongyong123456 commented on June 18, 2024

> Good question, and one I'd like to know the answer to as well!

I also want to know how to get a larger resolution with a limited GPU; I've been looking for an answer to this since I first tried the project.


nerdyrodent commented on June 18, 2024

With just 3GB of VRAM I'd personally use the Colab. If you really want to run on just 3GB of VRAM, try:

python generate.py -p "An apple" -s 64 64 -cuts 4
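
The reason `-cuts` helps: every cutout is one more image in the batch fed through CLIP, so activation memory in the vision transformer grows roughly linearly with the cutout count. A quick way to see the scaling (a standalone sketch, assuming ViT-B/32, which I believe is the script's default CLIP model):

    import torch
    import clip  # https://github.com/openai/CLIP

    device = "cuda"
    perceptor, _ = clip.load("ViT-B/32", device=device)

    for cutn in (4, 16, 32):
        torch.cuda.reset_peak_memory_stats()
        batch = torch.randn(cutn, 3, 224, 224, device=device)
        perceptor.encode_image(batch)  # forward pass only; result discarded
        peak = torch.cuda.max_memory_allocated() / 2**20
        print(f"cutn={cutn}: peak {peak:.0f} MiB")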


rlallen-nps commented on June 18, 2024

> Hrm. I have encountered this issue also on a 3GB GTX 1060. No matter the image size (even, for example, -s 16 16) it will inevitably crash after a few iterations with an out of memory error […]

Please let us know if anyone made progress here.

