
Comments (6)

yizhongw commented on May 26, 2024

Hi Leo, thanks for reporting your results. May I know how many GPUs you used? This matters because the effective batch size is per_device_train_batch_size x num_gpus x accumulation steps. In my experiment, I used 8 A100 GPUs, which results in a batch size of 16.

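To make the arithmetic above concrete, here is a minimal Python sketch of the effective-batch-size calculation; the per-device batch size and accumulation steps below are hypothetical placeholders, not values taken from the tk-instruct configuration.

```python
# Effective batch size = per-device batch size x number of GPUs x accumulation steps.
def effective_batch_size(per_device_train_batch_size: int,
                         num_gpus: int,
                         gradient_accumulation_steps: int) -> int:
    return per_device_train_batch_size * num_gpus * gradient_accumulation_steps

# Illustrative values only: with a per-device batch size of 1 and 2 accumulation
# steps, 8 GPUs give the batch size of 16 mentioned above, while a single GPU
# gives only 2, which changes the effective training setup.
print(effective_batch_size(1, 8, 2))   # 16
print(effective_batch_size(1, 1, 2))   # 2
```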
kuri-leo commented on May 26, 2024

Hey Yizhong,

Thanks for your rapid reply :-)

In my test, I only used one A100 for debugging, so I will try 8 GPUs and share the results later.

Thank you again and have a nice day!

Cheers,
Leo

kuri-leo commented on May 26, 2024

Hi Yizhong,

Thanks for your hints. I have successfully run two tests and obtained results of 48.5 (sampling 8 times) and 55.1235 (sampling 64 times), matching and exceeding the 48.5 and 54.7 reported in the paper, respectively.

That's amazing!

Leo

Yufang-Liu commented on May 26, 2024

Hi everyone,

It's strange that I can only get 48 on 8 RTX 3090 GPUs without changing any parameters. Does anyone know a possible reason?

kuri-leo commented on May 26, 2024

@Yufang-Liu

Hi Yufang,

Given the many factors that could lead to variations in scores, in my previous test it came down to the batch size. I would also suggest checking whether any optimizations such as half precision or ZeRO have been enabled automatically via the DeepSpeed or Accelerate packages.

Hope this helps.

Leo

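A minimal sketch of the kind of check Leo describes, assuming the run is driven by HuggingFace TrainingArguments with an optional DeepSpeed JSON config; ds_config.json and the argument values here are illustrative placeholders.

```python
import json

from transformers import TrainingArguments

# Inspect the precision and batch settings on the TrainingArguments your
# training script actually builds; the instance below is only a placeholder.
args = TrainingArguments(output_dir="output")
print("fp16:", args.fp16, "| bf16:", args.bf16)
print("per_device_train_batch_size:", args.per_device_train_batch_size)
print("gradient_accumulation_steps:", args.gradient_accumulation_steps)

# If a DeepSpeed config is in use, its fp16 and ZeRO sections also apply.
try:
    with open("ds_config.json") as f:
        ds_config = json.load(f)
    print("DeepSpeed fp16 enabled:", ds_config.get("fp16", {}).get("enabled"))
    print("DeepSpeed ZeRO stage:", ds_config.get("zero_optimization", {}).get("stage"))
except FileNotFoundError:
    print("no DeepSpeed config found at ds_config.json")
```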
Yufang-Liu commented on May 26, 2024

Hi Leo, thanks a lot for your helpful suggestions!

I found that the cause was the versions of the installed packages. With the same package versions, I got the same results on 8 RTX 3090 GPUs. I'm still not sure which package affects the performance.

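A minimal sketch of one way to pin down the environment differences mentioned above: record the versions of the likely-relevant packages on both machines and diff the output. The package list is an assumption, not taken from the tk-instruct requirements.

```python
# Print the installed versions of packages that commonly affect training results.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "transformers", "datasets", "accelerate", "deepspeed"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```

Running `pip freeze > requirements.lock` on both machines and diffing the files is an alternative that captures the full environment.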