
Comments (3)

MingLiiii commented on July 28, 2024

Hi, thank you for asking, but I doubt I can fix that problem: loss curves are highly task-specific, and I am not an expert on debugging them.

That said, feel free to send me an email if you would like to discuss it further.

from cherry_llm.

MingLiiii commented on July 28, 2024

Hi, thank you very much for your interest in this work!

First, I would like to clarify that this problem does not come from our selected data; it probably comes from the Stanford Alpaca codebase. You can find the training losses of our different models in our Hugging Face repo: https://huggingface.co/MingLiiii/cherry-alpaca-5-percent-7B/blob/main/trainer_state.json
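To compare your own run against the reference losses, the logged values can be pulled out of a `trainer_state.json` file. The following is only a minimal sketch: it assumes the file follows the usual Hugging Face `Trainer` layout, where `log_history` is a list of dicts carrying `step` and `loss` keys. The helper name `extract_losses` and the inline sample data are hypothetical and are not taken from the repo.

```python
import json


def extract_losses(trainer_state):
    """Return (step, loss) pairs from a Trainer state dict.

    Assumes the usual layout: a "log_history" list whose training
    entries carry both "step" and "loss". Entries without a "loss"
    key (for example the final summary entry) are skipped.
    """
    return [
        (entry["step"], entry["loss"])
        for entry in trainer_state.get("log_history", [])
        if "step" in entry and "loss" in entry
    ]


# Made-up sample standing in for a downloaded trainer_state.json.
# With a real file you would use: json.load(open("trainer_state.json"))
sample_state = json.loads(
    '{"log_history": ['
    '{"step": 1, "loss": 1.5},'
    '{"step": 2, "loss": 1.2},'
    '{"step": 2, "train_runtime": 10.0}'
    ']}'
)

losses = extract_losses(sample_state)
for step, loss in losses:
    print(f"step {step}: loss {loss}")
```

Running the same helper on your own `trainer_state.json` and on the reference file gives two lists that can be plotted or compared step by step.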

As for the fix, I think downgrading transformers to 4.28.1 should solve it: pip install transformers==4.28.1
You will probably also need to reinstall wandb:
pip install wandb
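After reinstalling, it can help to confirm that the environment really picked up the pinned version before launching another training run. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper names `installed_version` and `matches_pin` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Version suggested in this thread.
EXPECTED = "4.28.1"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(found, expected=EXPECTED):
    """True only when the installed version equals the pinned one exactly."""
    return found == expected


found = installed_version("transformers")
if matches_pin(found):
    print(f"transformers {found} matches the pinned version")
else:
    print(f"transformers is {found!r}, expected {EXPECTED}")
```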

You can find similar problems here:
tatsu-lab/stanford_alpaca#298
tloen/alpaca-lora#418
tloen/alpaca-lora#170

Hope it works for you.
Please let me know if you have any other questions~


ifshine commented on July 28, 2024

As a beginner, I felt lost when the loss curve started to look so strange.

Thank you very much for your quick and detailed response. The loss curve looks normal now.

(I also ran into a strongly oscillating loss curve while running code for some other projects. Would you be able to offer some debugging suggestions?)

