jpllm's Issues

Clean slurm scripts

Make the Slurm scripts consistent in requested time, memory, etc. 30 minutes has proven to be a reasonable time limit for a few messages to the model, 190GB a reasonable CPU memory limit, and 80GB a reasonable GPU memory limit.
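One way to keep the requests consistent is to generate the header from a single place. The sketch below is only an illustration: the function name `sbatch_header` and the job name are hypothetical, and how an 80GB GPU is selected depends on the cluster, so that part is left as a comment rather than a flag.

```python
def sbatch_header(job_name, time_limit="00:30:00", cpu_mem="190G", gpus=1):
    """Render a shared #SBATCH header so every script requests the same resources."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={time_limit}",  # 30 minutes: enough for a few messages
        f"#SBATCH --mem={cpu_mem}",      # CPU memory limit
        f"#SBATCH --gres=gpu:{gpus}",    # GPU count; picking an 80GB GPU is cluster-specific
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "mixtral_chat" is a placeholder job name, not one taken from the repo.
    print(sbatch_header("mixtral_chat"), end="")
```

Each script body can then be appended to the same header, so a change to the limits only has to be made once.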

Investigate and implement appropriate hyperparameters (temp, etc).

In web Mistral interfaces, the model code-switches to a limited but consistent degree. In my implementation, I am seeing no code-switching. I likely have generation left at its defaults (temp of 1), and decoding appears to be deterministic with a default to English. A temperature of 1 does not by itself make sampling deterministic, so it is worth checking whether sampling is enabled at all. A preliminary attempt to set temp to a value between 0 and 1 threw an error.
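To make the role of temperature concrete, here is a library-independent sketch of temperature-scaled sampling versus greedy decoding. The logits are made-up toy values, not model output, and the function names are hypothetical; the generation library actually in use may expose these settings differently.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; use greedy decoding instead")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Stochastic choice: the result can differ between calls, even at temperature 1."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def greedy_token(logits):
    """Deterministic choice: always the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    toy_logits = [2.0, 1.5, 0.5]  # hypothetical scores for three candidate tokens
    rng = random.Random(0)
    print(softmax_with_temperature(toy_logits, 1.0))
    print(softmax_with_temperature(toy_logits, 0.1))
    print([sample_token(toy_logits, 1.0, rng) for _ in range(10)])
    print(greedy_token(toy_logits))
```

If lower-probability tokens (here standing in for Spanish continuations) never appear, the likelier cause is greedy decoding rather than the temperature value itself.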

Prompt Mixtral w/ human speech data

Because we need to conform to the user/assistant chat template, we can alternate roles across the dialogue: whenever the speaker changes, switch from user to assistant or vice versa. Steps:

Clean corpus data into chat template
Provide chat template to model
Collect and Save output
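The first step above could be sketched as follows. This assumes the corpus has already been parsed into `(speaker, utterance)` pairs and that the target is the common list-of-dicts chat format with `role` and `content` keys; the function name and the sample dialogue are hypothetical.

```python
def dialogue_to_messages(turns):
    """Map (speaker, text) pairs to alternating user/assistant messages.

    The role flips every time the speaker changes; consecutive utterances
    from the same speaker are merged into one message.
    """
    messages = []
    prev_speaker = None
    role = "assistant"  # flipped to "user" on the first non-empty turn
    for speaker, text in turns:
        text = text.strip()
        if not text:
            continue  # skip empty utterances
        if messages and speaker == prev_speaker:
            messages[-1]["content"] += " " + text
        else:
            role = "user" if role == "assistant" else "assistant"
            messages.append({"role": role, "content": text})
            prev_speaker = speaker
    return messages


if __name__ == "__main__":
    # Invented example dialogue, not taken from any corpus.
    sample = [
        ("A", "Oye, did you finish the homework?"),
        ("A", "Porque I have no idea."),
        ("B", "Sí, but it took forever."),
        ("A", "Okay, gracias."),
    ]
    for message in dialogue_to_messages(sample):
        print(message)
```

With more than two speakers, this sketch still just flips the role at every speaker change, which may or may not be the desired behavior.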

Parallel Computations

Eventually we will scale to large instruction datasets, which will need parallel computation for efficiency (e.g., split 1000 prompts into sets of 50, 100, 250, or 500, depending on cluster availability; test experimentally).
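The splitting itself is simple; a minimal sketch follows, with a hypothetical helper name and placeholder prompts. How each batch is dispatched (for example, one Slurm job per batch) is an assumption left outside the code.

```python
def split_into_batches(prompts, batch_size):
    """Split a list of prompts into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(1000)]  # placeholder prompts
    for size in (50, 100, 250, 500):
        batches = split_into_batches(prompts, size)
        print(size, len(batches))
```

Timing a full run at each batch size would give the experimental comparison the issue asks for.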

Implement lang ID

Once a model with code-switching behavior (at least some code-switching, as in the web interface) is implemented, add language ID via the BERT model. Test on the Miami corpus or the LinCE framework.

Finish prompt transcription

Transcribe the linked prompting data from the Southeast Asia code-mixing study and adapt it to an English/Spanish context.
