WizardVicunaLM

Wizard's dataset + ChatGPT's conversation extension + Vicuna's tuning method

I am a big fan of the ideas behind WizardLM and VicunaLM. I particularly like the idea of WizardLM handling the dataset itself more deeply and broadly, as well as VicunaLM overcoming the limitations of single-turn conversations by introducing multi-round conversations. As a result, I combined these two ideas to create WizardVicunaLM. This project is highly experimental and designed for proof of concept, not for actual usage.

Benchmark

Approximately 7% performance improvement over VicunaLM

Detail

The questions presented here are not from rigorous tests; rather, I asked a few questions and had GPT-4 score the answers. The models compared were ChatGPT 3.5, WizardVicunaLM, VicunaLM, and WizardLM, in that order. A sketch of this scoring setup follows the table below.

|     | gpt3.5 | wizard-vicuna-13b | vicuna-13b | wizard-7b |      |
|-----|--------|-------------------|------------|-----------|------|
| Q1  | 95     | 90                | 85         | 88        | link |
| Q2  | 95     | 97                | 90         | 89        | link |
| Q3  | 85     | 90                | 80         | 65        | link |
| Q4  | 90     | 85                | 80         | 75        | link |
| Q5  | 90     | 85                | 80         | 75        | link |
| Q6  | 92     | 85                | 87         | 88        | link |
| Q7  | 95     | 90                | 85         | 92        | link |
| Q8  | 90     | 85                | 75         | 70        | link |
| Q9  | 92     | 85                | 70         | 60        | link |
| Q10 | 90     | 80                | 75         | 85        | link |
| Q11 | 90     | 85                | 75         | 65        | link |
| Q12 | 85     | 90                | 80         | 88        | link |
| Q13 | 90     | 95                | 88         | 85        | link |
| Q14 | 94     | 89                | 90         | 91        | link |
| Q15 | 90     | 85                | 88         | 87        | link |
| Avg | 91     | 88                | 82         | 80        |      |
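For reference, here is a minimal sketch of how such a GPT-4-as-judge comparison might be scripted with the legacy (pre-1.0) `openai` Python client. The scoring prompt and the helper below are assumptions for illustration, not the exact setup used to produce the table.

```python
# Hypothetical GPT-4-as-judge scorer; the prompt wording is an assumption,
# not the exact one used for the table above.
import openai

def score_answers(question: str, answers: dict) -> str:
    """Ask GPT-4 to grade each model's answer on a 0-100 scale."""
    listing = "\n\n".join(f"[{name}]\n{text}" for name, text in answers.items())
    prompt = (
        f"Question: {question}\n\n"
        f"Answers from several assistants:\n\n{listing}\n\n"
        "Score each answer from 0 to 100 for correctness and helpfulness, "
        "and briefly justify each score."
    )
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]
```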

Principle

We adopted WizardLM's approach of extending a single problem in greater depth. However, instead of using individual instructions, we expanded the data using Vicuna's conversation format and applied Vicuna's fine-tuning techniques.

Turning a single command into a rich conversation is what we've done here.

After creating the training data, I trained the model according to the Vicuna v1.1 training method. For concreteness, a single training record in this conversation format might look like the sketch below.
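This follows the ShareGPT-style schema that Vicuna's fine-tuning code consumes; the field names follow that convention, and the content is invented for illustration.

```python
# Illustrative training record in the ShareGPT-style conversation schema used
# by Vicuna's fine-tuning code. The content here is invented for illustration.
record = {
    "id": "wizard_0001",
    "conversations": [
        # The opening turn comes from a WizardLM instruction ...
        {"from": "human", "value": "Explain how photosynthesis works."},
        {"from": "gpt", "value": "Photosynthesis converts light energy ..."},
        # ... and ChatGPT extends it into deeper follow-up rounds.
        {"from": "human", "value": "How does the process differ in C4 plants?"},
        {"from": "gpt", "value": "C4 plants first fix CO2 into a four-carbon ..."},
    ],
}
```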

Detailed Method

First, we explore and expand various areas within the same topic using the 7K conversations created by WizardLM. However, we generated the data in a continuous conversation format instead of the instruction format. That is, each sample starts with a WizardLM instruction and then expands into various areas within a single conversation using ChatGPT 3.5. One way to implement this expansion step is sketched below.
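This is a minimal sketch of such an expansion loop, again using the legacy `openai` client. The follow-up prompt and the number of rounds are assumptions, since the project's own pipeline code has not been released (see the issues below).

```python
# Hypothetical expansion of one WizardLM instruction into a multi-turn
# conversation with ChatGPT 3.5. The follow-up prompt is an assumption.
import openai

def chat(messages):
    """Single ChatGPT 3.5 completion for the given message history."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo", messages=messages
    )
    return response["choices"][0]["message"]["content"]

def expand_to_conversation(instruction: str, rounds: int = 4) -> list:
    messages = [{"role": "user", "content": instruction}]
    for _ in range(rounds):
        messages.append({"role": "assistant", "content": chat(messages)})
        # Have the model propose a deeper follow-up question on the same
        # topic, then continue the conversation with it.
        follow_up = chat(messages + [{
            "role": "user",
            "content": "Ask one deeper follow-up question about this topic. "
                       "Reply with the question only.",
        }])
        messages.append({"role": "user", "content": follow_up})
    return messages[:-1]  # drop the final unanswered follow-up question
```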

After that, we fine-tuned the model using Vicuna's fine-tuning format.

Training Process

Trained with 8 A100 GPUs for 35 hours.

Weights

You can find the dataset we used for training and the 13b model on Hugging Face.

Conclusion

If we extend the conversations with GPT-4 32K, we can expect a dramatic improvement, since it can generate conversations that are 8x longer, as well as more accurate and richer.

Prompt

This model was trained with Vicuna v1.1, so it performs best when prompted in the following format.

USER: What is 4x8?
ASSISTANT:
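Below is a minimal inference sketch using Hugging Face transformers. The model id is a placeholder for wherever the weights are hosted, and the system preamble is the standard Vicuna v1.1 wording; treat both as assumptions.

```python
# Minimal inference sketch with Hugging Face transformers. MODEL_ID is a
# placeholder, not an official hub id.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/wizard-vicuna-13b"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Standard Vicuna v1.1 system preamble (assumed here).
system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions."
)
prompt = f"{system} USER: What is 4x8? ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```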

Reactions

Report that it works with AutoGPT

link

Report of improved abilities not only in Korean, but also in Chinese and Japanese.

Although it was tuned 100% in English, it is curious how its abilities in other languages, such as Korean, Chinese, and Japanese, have been enhanced even though their share of the training data should have decreased.
link1
link2

Report of enhanced coding skills.

link

Report of strengthened consistency during conversations.

link

Thanks to Prompt Engineering for the great video 🙏

License

The model is licensed under the LLaMA model license, and the dataset is licensed under OpenAI's terms because it was generated with ChatGPT. Everything else is free.

Author

JUNE LEE - He is active in Songdo Artificial Intelligence Study and GDG Songdo.

Contributors

eltociear, melodysdreamj, morpheus2448, thefaheem


wizardvicunalm's Issues

data processing

Can you share the data processing code? How do you convert instruction data into multi-turn dialogues? Should each instruction be converted into one turn? Thanks!

how much would it cost if we used GPT-4

I love the idea and the project you made.

My question is: how much would it cost if we used GPT-4?

Do you think it is possible to get an estimate of that cost? That way I could share the tokens while knowing how much it would cost.

Evaluate the model with OpenAI Evals

After releasing GPT-4, OpenAI was met with a significant challenge: there weren't many benchmarks for LLMs focused on emergent capabilities like translation, reasoning, pattern identification, etc. So they created Evals, a crowdsourced open-source set of benchmarks for LLMs. While somewhat OpenAI-centric, as the submission rules prohibit adding tests that GPT-4 can already consistently pass, it still remains a valuable tool for objective model evaluation.

If different open-access LLM projects can switch to a well-designed common benchmark, we may finally get to objectively compare our models' quality, which I find essential for the future of local LLMs. For example, we may compare this model against WizardLM, raw Vicuna, or GPT-3.5.

For reference on testing non-OpenAI models with Evals, see the OpenAssistant model evals.

How to infer?

Could you tell me how to use the model for inference?

Missing License File

It would be helpful if this repository included a license file.
I'm assuming that, since this is a fine-tune of a LLaMA-based model, it should share the same GNU General Public License v3.0, but it would be helpful for all users to be certain of it.

Plans on bringing this to MosaicML?

Right now, it seems like MosaicML's MPT has more future potential than LLaMA. Do you have plans to tune MPT models? (Also, 7b models can run on phones, so I would really like to see an MPT WizardVicunaLM 7b.)

data processing pipeline

Amazing work!!! Some questions:

  1. How exactly did you combine the two datasets? Can you share the code?
  2. Is the ShareGPT dataset obtained from here?

Got it

some items too big

Conversations longer than 2048 tokens need to be split into multiple conversations. A sketch of one way to do this follows the error message below.

I am seeing errors like this:

Token indices sequence length is longer than the specified maximum sequence length for this model (4209 > 2048). Running this sequence through the model will result in indexing errors
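Here is a minimal sketch of such a split, assuming ShareGPT-style turns and a Hugging Face tokenizer. Splitting only on turn boundaries is an assumption; a real pipeline might also need to truncate single turns that exceed the limit on their own.

```python
# Hypothetical splitter: break an over-long conversation into chunks that fit
# a 2048-token context. Splits on turn boundaries only.
def split_conversation(turns, tokenizer, max_tokens=2048):
    chunks, current, current_len = [], [], 0
    for turn in turns:
        n = len(tokenizer(turn["value"])["input_ids"])
        if current and current_len + n > max_tokens:
            chunks.append(current)        # close the full chunk ...
            current, current_len = [], 0  # ... and start a new one
        current.append(turn)
        current_len += n
    if current:
        chunks.append(current)
    return chunks
```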
