Comments (15)

Atinoda commented on July 22, 2024

Glad to hear that you're diving into LoRA training! The monkey-patch technique is quite old, but I've just checked upstream and I see that they're making some updates. I will keep an eye on how things develop before I start making structural changes to this repo. What GPU and OS are you running?

For now, I suggest that you use the default variant, use the Transformers loader (that's just 'plain' huggingface models - not GGML or GPTQ), and check the load-in-4-bit option. I am able to train ehartford/WizardLM-7B-Uncensored 4-bit LoRAs on an RTX3090 using these options. I think this is what Ph0rk0z is referring to in the link that you sent with "You can train with qlora (full size files)..."

It also seems to be possible to train LoRAs on modern GPTQ models, but it is not something I am currently doing.
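For the curious, here's a minimal sketch of roughly what that combination does under the hood, assuming the usual transformers / peft / bitsandbytes stack - an illustration of the QLoRA-style recipe, not textgen's actual trainer code:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model quantised to 4-bit (the "load-in-4-bit" option)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "ehartford/WizardLM-7B-Uncensored",  # the model named in this thread
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters; the 4-bit base weights stay frozen
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Only the adapter weights are updated during training, which is why this fits on a single RTX3090.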


Atinoda commented on July 22, 2024

Your system is well aligned with the project! That's basically what I run as the main LLM rig too. I think that the Transformers loader method I described previously is using the QLoRA technique (but I haven't taken the time to actually dig into the code - so I cannot say for sure). If the built-in trainer is sufficient then you do not need to run any code. If you're trying to modify textgen's training behaviour then you're in for a fair bit of development work!

Are you trying to achieve a longer context length?


Atinoda commented on July 22, 2024

Extended context is on my hit-list as well. Have you had any joy so far?

I still need to have a read through https://github.com/epfml/landmark-attention, which seemed promising.


Atinoda commented on July 22, 2024

Which guys cracked it - the ones you linked earlier? If there's off-the-shelf code to be integrated into textgen or deployable by Docker then that's exactly what I'll do!

I'm currently working on a large-ish project that integrates LLMs, and extended context is due for investigation a little bit later on. If I get to that point and there's nothing deployable then I'll have a go at coding a solution out of papers.


Atinoda commented on July 22, 2024

Thanks man, it's good to see other people's excitement and that you spotted that paper too! I'll need to sit down and read it properly soon, then go through their code and try running some implementations.

If you're using textgen as your platform, I do not think that you can run highly customised models and training regimes like that without, at the very least, a custom extension - but it will probably require a customised branch of the source code. That's not as bad as it sounds though: my production build uses a custom fork of textgen with some extra features that I need (and it runs using this repo's docker-compose.yml.build).


commented on July 22, 2024

Source: TurboDerp and kaiokendev

[perplexity plot]

And

```python
# These two lines:
self.scale = 1 / 4
t *= self.scale
```


Atinoda commented on July 22, 2024

I've flicked through that link and it does indeed look promising - I'm definitely saving that for later. Unfortunately the two lines of code described there are pretty low-level from my understanding (i.e., it's "only two bricks to move" but those two bricks are at the bottom of the wall on the ground floor of a skyscraper!). Nevertheless, if someone hasn't moved those bricks around in one of the major inference libraries by the time I'm looking at extended context, I will attempt it.


commented on July 22, 2024

Actually, it was pretty easy - I managed to port it to Alpaca LoRA 4-bit, but then I couldn't even get it to run any more, with or without the updated monkey-patch.

Here's a little summary. LLMs have been trained with tokens alongside their positional embeddings - for LLaMA, that's 2048. Now, what if we decoupled the positional embedding from the number of tokens fed in? We could extrapolate and run with 2049 and beyond, but we've tried it, and the line in yellow shows its limitations. This is where the realisation comes in that the models have always been trained at their max limit: going beyond 2048, LLaMA is like "huh, wtf is this?" So what next? Let's work within the 2048 with scaling. If we make it so every other token is 1/2 a positional step, e.g. 0.5, 1, 1.5..., then we essentially get 4096, and with 1/4, 8192, and so on. That was the 2 lines of code mentioned - just turning the scale into a fraction. Now when we train on it, we can get the model to recognise the in-between positional embeddings and use the max of 2048 as a float scale, possibly reaching a whole sentence per 1 positional embedding. Neat, huh?
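To make the two-line change concrete, here's a minimal sketch of where that scaling lands in a LLaMA-style rotary embedding - my own illustration of the idea, assuming the standard RoPE inverse-frequency setup, not the exact upstream patch:

```python
import torch

def scaled_rope_angles(dim: int, max_positions: int,
                       scale: float = 1 / 4, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies, one per pair of hidden dimensions
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Position indices 0, 1, 2, ..., max_positions - 1
    t = torch.arange(max_positions).float()
    # The two-line trick: squash positions by `scale`, so scale = 1/4
    # fits 8192 tokens inside the 0..2048 range the model was trained on
    t *= scale
    # Angle for every (position, frequency) pair, used for the cos/sin caches
    return torch.outer(t, inv_freq)
```

With scale = 1 this reproduces vanilla RoPE; the fractional scale is the only difference, which is why the patch really is just two lines.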


commented on July 22, 2024

Do you get what I mean now? They've cracked it with a really simple solution that anyone could play around with, even me :)


commented on July 22, 2024

I managed to get a training run going with 4096 context on Alpaca LoRA 4-bit :)


commented on July 22, 2024

I wrote the docker-compose and edited the Dockerfile for it, if you want them to run your own.


Atinoda commented on July 22, 2024

Congratulations - that's an awesome achievement, and right at the cutting edge of open-source LLM deployments! If you're happy to share the files, then please do - I'll test and try to integrate them as a variant here.


Atinoda commented on July 22, 2024

Would you be happy to post the edited files here? Or fork the repo then link / PR your changes in a new branch? If we develop in a public forum then it adds visibility for other people, and opens the door for other collaborators. How do you think that your results compare to Mosaic's model?


commented on July 22, 2024

I know - I already created a repo and all, I just wanted to have some form of PM. I'll add you to the repo tomorrow. Well, MPT-7B isn't as good as LLaMA right off the bat on long contexts, whereas kaiokendev released this 30B 8k-token model and that performs great - much better than MPT imo, even their 30B. The one thing I'll be excited for, though, is MPT-30B StoryWriter.


Atinoda commented on July 22, 2024

Stretched contexts through modified positional encodings are included in the ExLlama loader via the max_seq_len and compress_pos_emb options. Check it out! I am going to close this issue as completed, because the motivation of the 4-bit training feature request was to replicate this functionality.
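For anyone landing here later, a rough sketch of how those two settings relate (my own illustration with a hypothetical helper, not ExLlama's code):

```python
TRAINED_CTX = 2048  # LLaMA's native training context

def compress_factor(desired_ctx: int, trained_ctx: int = TRAINED_CTX) -> float:
    # compress_pos_emb divides every position index, mapping the stretched
    # sequence back into the 0..trained_ctx window the model was trained on
    return desired_ctx / trained_ctx

print(compress_factor(4096))  # -> 2.0: set max_seq_len=4096, compress_pos_emb=2
print(compress_factor(8192))  # -> 4.0: set max_seq_len=8192, compress_pos_emb=4
```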

