Hi there! First of all, thank you for the amazing work! The readme says the mo

Dataset used to pre-train about stablelm HOT 5 OPEN

stability-ai commented on July 21, 2024 2

Dataset used to pre-train

from stablelm.

Comments (5)

gururise commented on July 21, 2024 11

The readme indicates they are planning to release a technical report soon. I suspect the details will be in the tech report. I also hope they continue training the 3B and 7B models all the way up to 1.5T tokens!

THANK YOU STABILITY AI
Your contributions to the Open Source community are very much appreciated!

mcmonkey4eva commented on July 21, 2024 2

More information will be published soon and is very likely to answer your questions when it's available.

johann-petrak commented on July 21, 2024 1

It would be really interesting and important to share some basic information about the training data in the repo already, though, especially the kind and size of training data for each language.
Judging from a few initial trials, the amount of German training data is probably very small, and results in German are therefore quite poor.

fche commented on July 21, 2024

Can we expect that this forthcoming dataset declaration will include those inputs that imbue this model with politically correct output (even with a neutral SYSTEM prompt)?

MarkSchmidty commented on July 21, 2024

> Can we expect that this forthcoming dataset declaration will include those inputs that imbue this model with politically correct output (even with a neutral SYSTEM prompt)?

Only the "Tuned" model has a SYSTEM prompt, and that behavior comes from that model's fine-tuning datasets. They're the same fine-tuning data used for LLaMA fine-tunes like Alpaca and GPT4All, which consist of ChatGPT outputs. So ultimately it comes from ChatGPT.

The "Base" model does not have a SYSTEM prompt and does not use those datasets or any like them.
