Comments (4)

waleed177 commented on July 21, 2024

It also makes no sense to me, but I am no lawyer. But hey, I am happy with a libre base model c: Thank you, StabilityAI.

from stablelm.

mcmonkey4eva commented on July 21, 2024

(I am not a lawyer, and this is not legal advice. Consult a real lawyer before making decisions; this is just my personal view.)

Scale matters a lot when considering dataset usage rights. The base model is trained on a massive-scale mix of content, such that it doesn't really retain much content directly from any one individual source (i.e., in theory the license doesn't matter much). The finetune is trained directly on top of a small set of entirely license-restricted content (i.e., the model will directly retain information from the licensed content, thus the license must be matched appropriately).

As another way of thinking about it: imagine an artist, author, or any other human creative. If that person looks at some copyrighted work and copies from it directly, they're violating that copyright. However, that same person has also been through a lifetime of looking at copyrighted works that have undoubtedly influenced their creative thought. When they sit down and make something original (a work derived from the mix of ideas in their head, many of which originate from copyright-restricted works), their new work is not subject to prior copyrights; it is considered their own work.

zoobab commented on July 21, 2024

CC Non-Commercial means it cannot be packaged in Debian, due to the non-commercial restriction:

celery/celery#2890

Could you re-release it under a copyleft license if you want users that modify it to republish their changes?

And what dataset was used for the training?

mcmonkey4eva commented on July 21, 2024

@zoobab View the readme @ https://github.com/Stability-AI/StableLM#models for dataset info. More detail will be published soon.

I don't think a model file of ten gigabytes or more is fit to be packaged natively into Debian anyway. The actual relevant source code to run LLMs is separately maintained and separately licensed. It's just the models that have license info in this repository, and it's only the instruct finetune that's non-commercial, which has to be licensed that way due to the dataset used for the instruct finetuning.

Future revisions of the instruct finetune might use a different dataset and thus have a different license.
