Comments (4)
It also makes no sense to me, but I am no lawyer. But hey, I am happy with a libre base model c: Thank you, StabilityAI.
from stablelm.
(I am not a lawyer, this is not legal advice, consult a real lawyer before making decisions, this is just my personal thought)
Scale matters a lot when considering dataset usage rights. The base model is trained on a massive-scale mix of content, such that it doesn't really retain much content directly from any one individual source (i.e., in theory the license doesn't matter much). The finetune is trained directly on top of a small set of entirely license-restricted content (i.e., the model will directly retain information from the licensed content, thus the license must be matched appropriately).
As another way of thinking about it: Imagine an artist, author, or other human creative. If that person looks at some copyrighted work and copies from it directly, they're violating that copyright. However, that same person has also been through a lifetime of looking at copyrighted works that have undoubtedly influenced their creative thought. When they sit down and make something original (a work derived from the mix of ideas in their head, many of which originate from copyright-restricted works), their new work is not subject to prior copyrights; it is considered their own work.
CC Non-Commercial means it cannot be packaged in Debian, due to the non-commercial restriction.
Could you re-release it under a copyleft license if you want users that modify it to republish their changes?
And what dataset was used for the training?
@zoobab View the readme @ https://github.com/Stability-AI/StableLM#models for dataset info. More detail will be published soon.
I don't think a ten gig+ model file is fit to be packaged natively into Debian anyway? The actual relevant source code to run LLMs is separately maintained and separately licensed. It's just the models that have license info in this repository, and it's only the Instruct-finetune that's non-commercial, which has to be licensed that way due to the dataset used for the Instruct finetuning.
Future revisions of the instruct-finetune might use a different dataset and thus have a different license.