
Comments (8)

dacorvo avatar dacorvo commented on June 26, 2024

I did more tests, changing the default compiler optimization option from -O2 to -O1, and I am able to use the same configurations I used with transformers-neuronx==0.6.106: batch_size=1 and n_positions=2048.

During inference, device memory usage is 64 GB for the 13B model and 22 GB for the 7B model.

I also tested with -O3 but got the same kind of errors.
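For readers who want to reproduce the `-O1` run: the comment does not say how the optimization option was changed, so the following is only a sketch of one possible way. It assumes the compiler flags are picked up from the `NEURON_CC_FLAGS` environment variable before the model is compiled; the helper `set_opt_level` is a hypothetical name introduced here for illustration.

```python
import os


def set_opt_level(level: int, env_var: str = "NEURON_CC_FLAGS") -> str:
    """Replace any existing -O<n> flag in `env_var` with -O<level>.

    Assumption: the Neuron compiler reads extra flags from this variable.
    """
    current = os.environ.get(env_var, "").split()
    # Drop any previous optimization-level flag such as -O2 or -O3.
    kept = [f for f in current if not (f.startswith("-O") and f[2:].isdigit())]
    kept.append(f"-O{level}")
    os.environ[env_var] = " ".join(kept)
    return os.environ[env_var]


# Must run before the model is compiled so the flag is visible to the compiler.
set_opt_level(1)
```

Rewriting the variable rather than appending to it avoids ending up with two conflicting `-O` flags when an optimization level was already set.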

from transformers-neuronx.

aws-rhsoln avatar aws-rhsoln commented on June 26, 2024

Thank you for reporting the issue. We are replicating the issue on our end and will get back with a fix.


santhoshkolloju avatar santhoshkolloju commented on June 26, 2024

Hi
What throughput (tokens/sec) did you get on the 7B model?


dacorvo avatar dacorvo commented on June 26, 2024

With the 2.14.1 compiler (neuronx-cc), I am able to compile the llama2 7B model with -O1 for different batch sizes.

I tested several combinations of cores / batch size with the default maximum sequence length for the Llama model (2048).

Here are the results:

| cores/batch | 128 tokens | 512 tokens | 1024 tokens | 2048 tokens | Throughput   |
|-------------|------------|------------|-------------|-------------|--------------|
| 2c / bs2    | 8.5 s      | 34 s       | 69 s        | 143 s       | 29 tokens/s  |
| 2c / bs4    | 8.6 s      | 35 s       | 72 s        | 150 s       | 55 tokens/s  |
| 24c / bs2   | 1.3 s      | 5.4 s      | 11.5 s      | 22.8 s      | 180 tokens/s |
| 24c / bs4   | 1.4 s      | 5.8 s      | 11.5 s      | 24 s        | 341 tokens/s |
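The comment does not state how the Throughput column was derived. The figures are consistent with batch size × 2048 generated tokens divided by the 2048-token time, so the sketch below checks that reading; the formula is an inference from the numbers, and `throughput` is a helper name introduced here.

```python
# Seconds to generate 2048 tokens, copied from the table above.
# Keys are (cores, batch_size).
runs = {
    ("2c", 2): 143.0,
    ("2c", 4): 150.0,
    ("24c", 2): 22.8,
    ("24c", 4): 24.0,
}

NEW_TOKENS = 2048


def throughput(batch_size: int, new_tokens: int, seconds: float) -> float:
    """Aggregate tokens per second across all sequences in the batch."""
    return batch_size * new_tokens / seconds


for (cores, bs), secs in runs.items():
    print(f"{cores} / bs{bs}: {throughput(bs, NEW_TOKENS, secs):.0f} tokens/s")
```

Under this reading, doubling the batch size almost doubles aggregate throughput because the per-sequence generation time barely changes.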

Note: I experienced extremely long compilation times for batch size 4 (more than 3 hours), even with -O1, whereas it takes only minutes for batch size 1 or 2.


awsilya avatar awsilya commented on June 26, 2024

@dacorvo thank you for confirming. Yes, batch 4 compilation time is an issue; we are working on it and it's being tracked elsewhere. I'm closing this one.


awsilya avatar awsilya commented on June 26, 2024

closing


dacorvo avatar dacorvo commented on June 26, 2024

On most open-source projects, issues are closed only when they have been resolved, so that:

  • users reporting the issue can be notified when a fix is pushed,
  • new users facing the issue later can be redirected to the proper version.

How can we track progress on these compilation errors now that you've closed this one? Can you link it to the relevant issues?


hannanjgaws avatar hannanjgaws commented on June 26, 2024

Hi @dacorvo:

We confirmed that the Llama 7B compilation error you reported is fixed in the 2.15.2 Release. Can you install the latest Neuron SDK and try re-running your script to confirm that you no longer see compilation issues for this model?

