Comments (5)
instruct isn't a valid flag because that behavior is encompassed in the API itself – ChatCompletion will simulate a chat response and Completion will simulate a plain completion specifically. So the flag isn't necessary (the app using the OpenAI API should already be doing the right "instruct" mode when needed)
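The distinction above can be sketched as the two request shapes an OpenAI-compatible server accepts (a minimal sketch; the endpoint paths are OpenAI's, but the helper names here are illustrative, not part of gpt-llama.cpp):

```javascript
// Sketch of the two OpenAI-style request shapes the comment refers to.
// No "instruct" flag exists: the chosen endpoint itself selects the mode.

// Chat-style: the server renders the messages array into a dialogue template.
function chatRequest(model, userText) {
  return {
    path: "/v1/chat/completions",
    body: { model, messages: [{ role: "user", content: userText }] },
  };
}

// Completion-style: the caller supplies the raw prompt verbatim.
function completionRequest(model, prompt) {
  return { path: "/v1/completions", body: { model, prompt } };
}

console.log(chatRequest("gpt-3.5-turbo", "hi").path);       // "/v1/chat/completions"
console.log(completionRequest("gpt-3.5-turbo", "hi").path); // "/v1/completions"
```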
For the model, you want to pass that into the GPT app instead (like chatbot-ui or auto-gpt), typically in its .env
file, so it'd look something like OPENAI_API_KEY=../llama.cpp/models/wizardLM-7B-GGML/wizardLM-7B.ggml.q5_1.bin
from gpt-llama.cpp.
That would be weird abuse of a variable. It would be much better to have a LOCAL_MODEL_PATH variable, and if no local model path is set, then use OpenAI's API, for example. I would favor trying to use a de facto standard local API such as text-generation-webui's API, rather than trying to reinvent the wheel by running local models directly, though. For one thing, sharing one local API means that multiple tools can use it. For another, there's a LOT of complexity in supporting local acceleration hardware and different model types and so on. Just using a standard local API makes it a lot simpler.
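The fallback suggested here could look something like this (a minimal sketch: LOCAL_MODEL_PATH is the variable name proposed in the comment, not one either project defines, and the local API URL/port is a placeholder assumption):

```javascript
// Sketch of the proposed env-based dispatch: use a local OpenAI-compatible
// API when a local model is configured, otherwise fall back to OpenAI.
// LOCAL_MODEL_PATH and the localhost URL are illustrative assumptions.
function resolveBackend(env) {
  if (env.LOCAL_MODEL_PATH) {
    // A local model is configured: point at a local OpenAI-compatible API
    // (e.g. one exposed by text-generation-webui; port is an assumption).
    return { baseUrl: "http://localhost:5000/v1", model: env.LOCAL_MODEL_PATH };
  }
  // No local model: use OpenAI's hosted API with the real key.
  return { baseUrl: "https://api.openai.com/v1", apiKey: env.OPENAI_API_KEY };
}

console.log(resolveBackend({ LOCAL_MODEL_PATH: "./models/wizardLM-7B.bin" }).baseUrl);
console.log(resolveBackend({ OPENAI_API_KEY: "sk-..." }).baseUrl);
```

The design point is that the key keeps its normal meaning in both branches, instead of smuggling a file path through OPENAI_API_KEY.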
@keldenl
Sorry, I think I'm missing something. How do I get it to follow the ### Instruction / ### Response template for Alpaca and similar models? When I use ChatCompletion, it seems to use a User: / Assistant: template, which isn't working for WizardLM – the LLM doesn't follow my instructions.
When I use the Completions endpoint and add the Instruction/Response template into the prompt, the server seems to hang and no response is generated.
It processes the prompt, then the ===== RESPONSE ===== line appears, and that's it.
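For reference, the Alpaca-style template described here can be built for the plain Completions endpoint roughly like this (the header strings below follow the common Alpaca convention; individual fine-tunes such as WizardLM may expect slightly different markers):

```javascript
// Builds an Alpaca-style prompt for the Completions endpoint.
// The "### Instruction:" / "### Response:" markers are the common
// Alpaca convention; specific fine-tunes may vary.
function alpacaPrompt(instruction) {
  return [
    "Below is an instruction that describes a task. " +
      "Write a response that appropriately completes the request.",
    "",
    "### Instruction:",
    instruction,
    "",
    "### Response:",
    "",
  ].join("\n");
}

console.log(alpacaPrompt("List three colors."));
```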
That would be weird abuse of a variable. It would be much better to have a LOCAL_MODEL_PATH variable, and if no local model path is set, then use OpenAI's API, for example. I would favor trying to use a de facto standard local API such as text-generation-webui's API, rather than trying to reinvent the wheel by running local models directly, though. For one thing, sharing one local API means that multiple tools can use it. For another, there's a LOT of complexity in supporting local acceleration hardware and different model types and so on. Just using a standard local API makes it a lot simpler.
The thing about this is that the end goal for this project is to plug 'n' play with any GPT-powered project – the fewer changes to the code (even zero changes, like with chatbot-ui), the better. LOCAL_MODEL_PATH
is something apps would have to account for explicitly (e.g. langchain supporting local models), but this project aims to solve, for all the other GPT apps out there: how can we leverage the work folks have already done but run a local model against it? That's the goal.
@keldenl
Sorry, I think I'm missing something. How do I get it to follow the ### Instruction / ### Response template for Alpaca and similar models? When I use ChatCompletion, it seems to use a User: / Assistant: template, which isn't working for WizardLM – the LLM doesn't follow my instructions.
When I use the Completions endpoint and add the Instruction/Response template into the prompt, the server seems to hang and no response is generated.
It processes the prompt, then the ===== RESPONSE ===== line appears, and that's it.
@regstuff it sounds like you might be running into a different issue – any chance you could post what's showing up in your terminal and what the request is? (Where are you using the server? chatbot-ui?)
Also, I just merged some changes that should give you better error logging, so maybe pull and then post here?
Related Issues (20)
- TypeError: Window.fetch: HEAD or GET Request cannot have a body.
- npm error on gpt-llama.cpp
- Slow speed Vicuna - 7B Help plz
- llama.cpp GPU support
- Are there different specific instructions for running Red Pajama?
- no response message with Readable Stream: CLOSED
- Error: spawn ..\llama.cpp\main ENOENT at ChildProcess._handle.onexit
- SERVER BUSY, REQUEST QUEUED
- Cannot POST /V1/embeddings
- Bearer Token vs Model parameter?
- Why is a default chat being forced?
- Every Other Chat Response
- Finding last messages?
- "Internal Server Error" on a remote server
- Change listening ip to public ip?
- gguf supported?
- llama.cpp unresponsive for 20 seconds
- Module not found: Package path ./lite/tiktoken_bg.wasm?module is not exported from package
- node:events:491 throw er; // Unhandled 'error' event Error: spawn YOUR_KEY=../llama.cpp/main ENOENT
- How to create a single binary