Comments (2)
There is an ambiguity with this approach: if the model is in a subdirectory, e.g.:
~/.local/share/edgen/models/chat/completions/my-models/top-models/my-top-model.gguf
it could be either a file path or a model identifier with "my-models" as the owner, "top-models" as the repo, and "my-top-model.gguf" as the model. If the file exists, the Hugging Face API is bypassed; otherwise, edgen will try to download it from Hugging Face.
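A minimal sketch of that resolution order (the `resolve` function and `Resolution` enum are hypothetical names for illustration, not edgen's actual API): the string is first tried as a file under the model directory, and only falls through to Hugging Face if no such file exists.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum Resolution {
    LocalFile(PathBuf),
    HuggingFace(String),
}

// Sketch of the ambiguity resolution: a model string is first interpreted
// as a path relative to the model directory; only if no such file exists
// is it treated as a Hugging Face identifier (owner/repo/model).
fn resolve(model_dir: &Path, model: &str) -> Resolution {
    let candidate: PathBuf = model_dir.join(model);
    if candidate.is_file() {
        // The file exists, so the Hugging Face API is bypassed.
        Resolution::LocalFile(candidate)
    } else {
        // Otherwise edgen would try to download it from Hugging Face.
        Resolution::HuggingFace(model.to_string())
    }
}
```

Under this scheme a local file always shadows an identically named Hugging Face model, which is exactly the ambiguity the comment points out.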
The model parameter in AI endpoint requests is now considered again. There are four valid cases:
- model contains a path model-file such that <config model dir> + "/" + <model-file> exists, e.g.:
  "model": "my-model.gguf" => ~/.local/share/edgen/models/chat/completions/my-model.gguf
  In this case, the Hugging Face API is bypassed. If the file does not exist, the endpoint returns "No Such Model".
- model contains a model identifier in the format owner/repo/model, e.g.:
  "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/tinyllama-1.1b-chat-v1.0.Q2_K.gguf" =>
  ~/.local/share/edgen/models/chat/completions/models--TheBloke--TinyLlama-1.1B-Chat-v1.0-GGUF
  In this case, the Hugging Face API is used to identify the model and, if necessary, to download it.
- model contains "default" (case-insensitive): the model defined in the config is used. Whether the Hugging Face API is used depends on what is configured there (a manually managed file or a Hugging Face-managed model).
- model contains nothing (e.g. "", " ", etc.): treated like "default".
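The four cases above can be sketched as a single dispatch function (`pick_model` and `ModelSource` are illustrative names, not edgen's actual API; the local-file check is placed first, consistent with the ambiguity resolution discussed earlier):

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum ModelSource {
    LocalFile(PathBuf),  // case 1: existing file under the model dir
    HuggingFace(String), // case 2: owner/repo/model identifier
    ConfigDefault,       // cases 3 and 4: "default", empty, or blank
    NoSuchModel,         // case 1 fallback: path-like but missing
}

fn pick_model(model_dir: &Path, model: &str) -> ModelSource {
    let trimmed = model.trim();
    // Cases 3 and 4: "default" (case-insensitive) or an empty/blank string.
    if trimmed.is_empty() || trimmed.eq_ignore_ascii_case("default") {
        return ModelSource::ConfigDefault;
    }
    // Case 1: an existing file under the model directory bypasses the API.
    let candidate = model_dir.join(trimmed);
    if candidate.is_file() {
        return ModelSource::LocalFile(candidate);
    }
    // Case 2: owner/repo/model is resolved via the Hugging Face API.
    if trimmed.split('/').count() == 3 {
        return ModelSource::HuggingFace(trimmed.to_string());
    }
    // Path-like but no such file: the endpoint returns "No Such Model".
    ModelSource::NoSuchModel
}
```

For example, `pick_model(dir, "DEFAULT")` yields `ConfigDefault`, while a missing `"my-model.gguf"` yields `NoSuchModel`.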
Related Issues (20)
- Tokio runtime panicking due to `llama_cpp::LlamaSession::context_size` using `block_on`
- Add one shot LLM requests
- RAM and VRAM monitoring
- limit on file size for audio transcription
- feat: edgen needs to handle 1000s of requests
- feat: auto-detect GPU and use it, if available
- feat: context sliding window
- chore: setup cargo deny in ci
- how do I build edgen locally in Mac
- chore(docs): add embeddings in api reference
- audio/transcriptions may specify language
- epic: candle integration - image generation
- refactor: agnostic to ML backend
- chore: add integration tests for embeddings
- feat: automatic api docs
- chore: /embeddings/status endpoint
- chore: add embeddings to settings integration tests
- Pass stop words directly to backend
- Refactor openai_shim.rs