
Comments (9)

saminatorkash commented on July 28, 2024

I completely agree that we should be able to use as much swap memory as we please.
Same here, using an 8GB model. I haven't yet tried your solution of raising the 1.5 value in the allocation. I will play with it and report back.

python txt2image.py "A photo of an astronaut riding a horse on Mars." --n_images 1 --n_rows 2
diffusion_pytorch_model.safetensors: 100%|█| 3.46G/3.46G [09:10<00:00, 6.29MB/s]
text_encoder/config.json: 100%|█████████████████| 613/613 [00:00<00:00, 906kB/s]
model.safetensors: 100%|███████████████████| 1.36G/1.36G [04:41<00:00, 4.83MB/s]
vae/config.json: 100%|██████████████████████████| 553/553 [00:00<00:00, 947kB/s]
diffusion_pytorch_model.safetensors: 100%|███| 335M/335M [00:57<00:00, 5.81MB/s]
tokenizer/vocab.json: 100%|████████████████| 1.06M/1.06M [00:00<00:00, 1.18MB/s]
tokenizer/merges.txt: 100%|███████████████████| 525k/525k [00:00<00:00, 882kB/s]
100%|███████████████████████████████████████████| 50/50 [05:19<00:00,  6.39s/it]
  0%|                                                                                                                                 | 0/1 [00:00<?, ?it/s]libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 134217728 bytes.
Abort trap: 6
UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

from mlx.

swamyg commented on July 28, 2024
libc++abi: terminating due to uncaught exception of type std::runtime_error:
[malloc_or_wait] Unable to allocate 100237312 bytes.

I'm running into this error on my M3 Max with 36GB of RAM when trying to run lora.py from mlx-examples. I'm trying to fine-tune the mistral-7B model.

100237312 bytes doesn't seem like that much, so I'm not sure why it's failing.


awni commented on July 28, 2024

See ml-explore/mlx-examples#70 for some ideas on how to reduce LoRA memory consumption until we have quantization.


s-smits commented on July 28, 2024

I also got a VRAM error when loading Phi-3, even though about 2GB more was available than the 9.8GB shown in the terminal. Is it possible to set the VRAM limit to `max_available_at_initiating` or something like that, so that other applications only take up swap?


awni commented on July 28, 2024

There is a maximum size you can allocate into a single buffer (which is a machine-specific property). I think it is less than 9.8 GB for you.

But either way the fact that you are trying to put 9GB into a single buffer is not a good sign. What are you running to get that? Is it from training or generation?


s-smits commented on July 28, 2024

It is a 16GB Air M1. Do you happen to know a ballpark of the limit? Or does it depend dynamically on other processes?
I was running the mlx_lm.utils load and generate functions on Phi-3-128k-mlx with ~6k context (when I run it again it says 12.2GB is needed). Is it limited to only 8GB of VRAM? With PyTorch I am able to run 14GB of Python files without much of a speed loss (with around 4-5GB of swap, off the top of my head).


awni commented on July 28, 2024

> It is a 16GB Air M1, do you happen to know a ballpark of the limit?

I don't know but you could try running this until it breaks:

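import mlx.core as mx

# Don't keep freed buffers in the Metal cache, so memory is released between iterations.
mx.metal.set_cache_limit(0)
for i in range(100):
    print(f"{i} GB")
    # bool_ is one byte per element, so this requests i * 2**30 bytes in one buffer.
    a = mx.zeros((2**30, i), mx.bool_)
    mx.eval(a)  # force the lazy array to actually be allocated
    del a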


awni commented on July 28, 2024

I'm going to close this issue as I'm not sure why it's still open. Feel free to file a new issue if you are still having issues with memory allocation.


s-smits commented on July 28, 2024

air@MacBook-Air-van-Air test-repo % /opt/homebrew/bin/python3.10 /Users/air/Repositories/test-repo/test4.py
0 GB
1 GB
2 GB
3 GB
4 GB
5 GB
6 GB
7 GB
8 GB
9 GB
libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 9663676416 bytes.
zsh: abort      /opt/homebrew/bin/python3.10
Just an FYI, no need for me to open a new issue, thank you.
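
For anyone reading the numbers in this thread, here is a small sketch that converts the byte counts from the three error messages above into MiB and GiB. The helper name `describe` and the labels are made up for illustration; only the byte counts come from the logs. It assumes binary units (1 GiB = 2**30 bytes), which matches the probe script if `mx.bool_` takes one byte per element.

```python
# Convert the failed allocation sizes quoted in this thread to binary units.
MIB = 2**20
GIB = 2**30


def describe(num_bytes: int) -> str:
    """Format a byte count as MiB and GiB (hypothetical helper)."""
    return f"{num_bytes} bytes = {num_bytes / MIB:.2f} MiB = {num_bytes / GIB:.3f} GiB"


# Byte counts copied from the error messages above; labels are informal.
failed_allocations = {
    "stable diffusion (txt2image.py)": 134217728,
    "LoRA fine-tune (lora.py)": 100237312,
    "single-buffer probe on 16GB M1 Air": 9663676416,
}

for label, num_bytes in failed_allocations.items():
    print(f"{label}: {describe(num_bytes)}")
```

The probe failure is exactly 9 * 2**30 bytes, so on that machine the 8 GiB buffer succeeded and the 9 GiB one did not. The other two failures are small requests (128 MiB and under 100 MiB), which suggests they failed because memory was already nearly exhausted rather than because of the per-buffer limit.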

