Comments (2)
This doesn't make sense to me, as I have seen the opposite: the CPU is super slow and only a GPU can truly speed things up. llama.cpp is the faster CPU version, however, but my query times when talking to my own embedded data are still 1-3 minutes per query.
from llama-chat.
I have a question about memory (I also see very slow responses when using a GPU). On my Dell Precision 7780 workstation laptop with the 7B model, Nvidia GPU memory goes up to about 3 GB of the available 6 GB. I assume this is due to 4-bit quantization? But CPU memory shoots up to almost 95% of my 32 GB, so it appears the model is being loaded onto both the GPU and the CPU at the same time. Looking at the code, I see these lines in model.py:
    for layer in tqdm(self.layers, desc="flayers", leave=True):
        if use_gpu:
            move_parameters_to_gpu(layer)
        h = layer(h, start_pos, freqs_cis, mask)
        if use_gpu:
            move_parameters_to_cpu(layer)  # the line in question
Why would it move parameters to the CPU inside that if use_gpu statement?
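One plausible reading of the loop quoted above is layer-by-layer offloading: the full set of weights stays in CPU RAM, each layer is copied to the GPU only for its own forward pass, and it is moved back afterwards to free VRAM for the next layer. The sketch below is a pure-Python simulation of that bookkeeping, not the project's actual code. The function name `stream_layers` and the layer sizes are hypothetical, and it assumes a move really releases the memory on the source device.

```python
def stream_layers(layer_sizes, use_gpu=True):
    """Simulate running layers one at a time.

    layer_sizes: per-layer parameter sizes in arbitrary units (hypothetical).
    Returns the peak amount resident on each device during the pass.
    """
    # Assumption: all weights start out in CPU RAM.
    resident = {"cpu": sum(layer_sizes), "gpu": 0}
    peak = dict(resident)

    for size in layer_sizes:
        if use_gpu:
            # Stands in for move_parameters_to_gpu(layer).
            resident["cpu"] -= size
            resident["gpu"] += size
            peak["gpu"] = max(peak["gpu"], resident["gpu"])

        # h = layer(h, start_pos, freqs_cis, mask) would run here.

        if use_gpu:
            # Stands in for move_parameters_to_cpu(layer): frees VRAM
            # so the next layer can take its place.
            resident["gpu"] -= size
            resident["cpu"] += size

    return peak


if __name__ == "__main__":
    # 32 equally sized layers of 100 units each (made-up numbers).
    print(stream_layers([100] * 32))  # {'cpu': 3200, 'gpu': 100}
```

Under these assumptions the GPU never holds more than one layer at a time, while CPU RAM holds the whole model, which would be consistent with low GPU memory and high CPU memory. It would also mean every layer crosses the CPU-GPU bus on every forward pass, which is one possible cause of the slow responses. Whether this is what happens in model.py would need to be confirmed against the helper functions themselves.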