Comments (5)
The readme indicates they are planning to release a technical report soon. I suspect the details will be in the tech report. I also hope they continue the training on the 3b and 7b model all the way up to 1.5T tokens!
THANK YOU STABILITY AI
Your contributions to the Open Source community are very much appreciated!
from stablelm.
More information will be published soon and is very likely to answer your questions when it's available.
from stablelm.
It would really be interesting and important to share some basic information about the training data already and in the repo though, especially the kind and size of training data for each language.
Judging from a few first trials, the amount of German training data is probably very small and thus results in German are quite poor.
from stablelm.
Can we expect that this forthcoming dataset declaration will include those inputs that imbue this model with politically correct output (even with a neutral SYSTEM prompt) ?
from stablelm.
Can we expect that this forthcoming dataset declaration will include those inputs that imbue this model with politically correct output (even with a neutral SYSTEM prompt) ?
Only the "Tuned" model has a SYSTEM prompt and that model's finetuning datasets are where that is coming from. They're the same finetuning data used for llama finetunes like Alpaca and GPT4All, which are outputs of ChatGPT. So ultimately they come from ChatGPT.
The "Base" model does not have a SYSTEM prompt and does not use those datasets or any like them.
from stablelm.
Related Issues (20)
- Any advice how to train a model of a different language? HOT 2
- What's the proper way to implement chatting feature? HOT 6
- RLHF training code for StableVicuna open sourced? HOT 1
- StableVicuna does not stop dialog speaking, probably until max_new_tokens. HOT 3
- loss not decreasing with deepspeed HOT 1
- Training Script stablity 3B and 7B HOT 6
- Unclear tokenizer class HOT 2
- Cannot run demo HOT 2
- fairyfloss HOT 2
- process killed HOT 4
- License unclear HOT 8
- Is it normal to take a long time ( about 15min )to generate an answer? HOT 1
- How to expand the sequence length of llama? HOT 1
- Consider using OpenAI Evals
- The output is the same as the input. HOT 1
- Is this project abandoned? HOT 4
- Stability AI
- Hello, how to convert the statityai/tablelm-base-alpha-3b to ggml format HOT 1
- Target modules ['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h'] not found in the base model. Please check the target modules and try again. HOT 2
- OSError: stabilityai/stablelm-base-alpha-3b-v2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack. HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from stablelm.