
magi_llm_gui's Introduction

  • 👋 Hi, I'm @shinomakoi
  • 👀 I'm interested in AI
  • 🌱 I'm currently learning Python
  • 💞️ I'm looking to collaborate on AI projects
  • 📫 How to reach me ...

magi_llm_gui's People

Contributors

shinomakoi

Forkers

brimming2020

magi_llm_gui's Issues

Errors crash app.

Any time I run into any kind of error or limit (e.g. exceeding the 2048-token limit), the UI crashes and I have to restart the whole app.

--- Launched app
--- Loading Exllama model...
INFO: Could not find files for the given pattern(s).
Injected compiler path: C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64
--- Exllama model: E:\Backups\Deep Models\LLama\models\TheBloke_robin-33B-v2-GPTQ\robin-33b-GPTQ-4bit--1g.act.order.safetensors
--- Exllama model load parameters: {'model_path': 'E:/Backups/Deep Models/LLama/models/TheBloke_robin-33B-v2-GPTQ', 'gpu_split': False}
Traceback (most recent call last):
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 70, in run
    backend_method()
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 82, in run_exllama
    for response in responses:
  File "C:\Users\xxxx\Deep\magi\exllama_generate.py", line 151, in generate_with_streaming
    token = generator.gen_single_token()
  File "C:\Users\xxxx\Deep\magi\exllama\generator.py", line 268, in gen_single_token
    logits = self.model.forward(self.sequence[:, -1:], self.cache)
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 760, in forward
    hidden_states = decoder_layer.forward(hidden_states, cache, buffers[device])
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 367, in forward
    self.self_attn.fused(hidden_states, cache, buffer, self.input_layernorm)
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 254, in fused
    key_states = cache.key_states[self.index].narrow(2, 0, past_len + q_len)
RuntimeError: start (0) + length (2049) exceeds dimension size (2048).
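Crashes like this usually mean the worker thread dies on an unhandled exception. As a rough, hypothetical sketch (not magi's actual API — `backend_method` follows the traceback, while `on_error` is an assumed callback wired to the UI), the generation call could be wrapped so the error is reported instead of taking the app down:

```python
def run_safely(backend_method, on_error):
    """Run one generation pass; report failures instead of letting the thread die."""
    try:
        backend_method()
    except Exception as exc:  # e.g. RuntimeError when the 2048-token cache is full
        on_error(f"Generation failed: {exc}")

# Demonstration with a stand-in backend that reproduces the reported error:
def boom():
    raise RuntimeError("start (0) + length (2049) exceeds dimension size (2048).")

messages = []
run_safely(boom, messages.append)
print(messages[0])
```

In a Qt app the `on_error` callback would typically emit a signal back to the main thread rather than touch widgets directly.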

Error: ValueError: Found group index but no groupsize. What do?

Traceback (most recent call last):
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 786, in textgen_switcher
    self.load_model('llama.cpp')
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 768, in load_model
    cpp_model = LlamaCppModel.from_pretrained(
  File "C:\Users\xxxx\Deep\magi\llamacpp_generate.py", line 12, in from_pretrained
    return cls(**params)
  File "C:\Users\xxxx\Deep\magi\llamacpp_generate.py", line 7, in __init__
    self.model = Llama(**params)
  File "C:\Users\xxxx\anaconda3\envs\magi\lib\site-packages\llama_cpp\llama.py", line 191, in __init__
    raise ValueError(f"Model path does not exist: {model_path}")
ValueError: Model path does not exist:
--- Loading Exllama model...
INFO: Could not find files for the given pattern(s).
Injected compiler path: C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64
E:\Backups\Deep Models\LLama\models\TheBloke_airoboros-33b-gpt4-GPTQ\gptq_model-4bit--1g.safetensors
Exception ignored in: <function Llama.__del__ at 0x0000021C1CA313F0>
Traceback (most recent call last):
  File "C:\Users\xxxx\anaconda3\envs\magi\lib\site-packages\llama_cpp\llama.py", line 1333, in __del__
    if self.ctx is not None:
AttributeError: 'Llama' object has no attribute 'ctx'
Traceback (most recent call last):
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 788, in textgen_switcher
    self.load_model('Exllama')
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 760, in load_model
    exllama_model = ExllamaModel.from_pretrained(
  File "C:\Users\xxxx\Deep\magi\exllama_generate.py", line 64, in from_pretrained
    model = ExLlama(config)
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 759, in __init__
    layer = ExLlamaDecoderLayer(self.config, tensors, f"model.layers.{i}", i, sin, cos)
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 345, in __init__
    self.self_attn = ExLlamaAttention(self.config, tensors, key + ".self_attn", sin, cos, self.index)
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 257, in __init__
    self.q_proj = Ex4bitLinear(config, self.config.hidden_size, self.config.num_attention_heads * self.config.head_dim, False, tensors, key + ".q_proj")
  File "C:\Users\xxxx\Deep\magi\exllama\model.py", line 165, in __init__
    if self.groupsize is None: raise ValueError("Found group index but no groupsize. What do?")
ValueError: Found group index but no groupsize. What do?
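For context, this error tends to come from GPTQ files quantized with act-order and groupsize -1 (the `-1g` in the filename above): a `g_idx` tensor is present, but its rows do not form uniform groups, so no single groupsize can be inferred. A rough illustration of that inference, using plain lists rather than the real tensors:

```python
from collections import Counter

def infer_groupsize(g_idx):
    """Return the uniform group size implied by g_idx, or None if groups are irregular."""
    counts = list(Counter(g_idx).values())
    if counts and all(c == counts[0] for c in counts):
        return counts[0]
    return None

print(infer_groupsize([0, 0, 0, 0, 1, 1, 1, 1]))  # uniform groups of 4
print(infer_groupsize([0, 0, 0, 1, 1]))           # irregular -> None
```

A file quantized with an explicit groupsize (e.g. a 128g variant of the same model, if one is published) may load where the -1g act-order file does not.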

Continue button not working.

Exllama, loaded TheBloke Chronos Hermes 13B.

Everything was working until generation stopped; the Continue button did nothing, and I had to type "continue" and press Generate to finish.
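As a hypothetical sketch of what a working Continue action usually amounts to (not magi's actual implementation): resubmit the original prompt plus everything generated so far, so the model resumes where it stopped:

```python
def build_continue_prompt(original_prompt, generated_so_far):
    """Resume generation by feeding back the prompt plus the partial output."""
    return original_prompt + generated_so_far

# Hypothetical prompt format for illustration only:
prompt = build_continue_prompt("### Instruction: tell a story\n### Response: ",
                               "Once upon a time")
print(prompt)
```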

Error: ModuleNotFoundError: No module named 'diskcache'

Built llama.cpp with cuBLAS, then tried to load a model with 44 layers offloaded on a 4090.

(magi) PS C:\Users\xxxx\Deep\magi> python .\magi_llm_app.py
--- Launched app
--- Loading llama.cpp model...
Traceback (most recent call last):
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 786, in textgen_switcher
    self.load_model('llama.cpp')
  File "C:\Users\xxxx\Deep\magi\magi_llm_app.py", line 766, in load_model
    from llamacpp_generate import LlamaCppModel
  File "C:\Users\xxxx\Deep\magi\llamacpp_generate.py", line 1, in <module>
    from llama_cpp import Llama
  File "C:\Users\xxxx\anaconda3\envs\magi\lib\site-packages\llama_cpp\__init__.py", line 2, in <module>
    from .llama import *
  File "C:\Users\xxxx\anaconda3\envs\magi\lib\site-packages\llama_cpp\llama.py", line 21, in <module>
    import diskcache
ModuleNotFoundError: No module named 'diskcache'
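The fix is to install the missing package into the same environment magi runs from; the commands below assume the `magi` conda environment from the traceback is active:

```shell
pip install diskcache
# Alternatively, reinstalling/upgrading llama-cpp-python may pull it in,
# depending on whether the installed version declares it as a dependency:
pip install --upgrade llama-cpp-python
```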

Errors loading ggml models

Hey there, it's been a while. Glad to see there was a recent update. I tried loading a different quant today and hit an error; not sure how to proceed. Also, is there any way to incorporate AutoGPTQ?

--- Launched app
--- Set theme to: native
--- llama.cpp model load parameters: None
--- Loading llama.cpp model...
llama.cpp: loading model from E:/Backups/Deep Models/LLama/models/orca_mini_v3_70b.ggmlv3.q4_K_M.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 8192
llama_model_load_internal: n_mult = 7168
llama_model_load_internal: n_head = 64
llama_model_load_internal: n_layer = 80
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q4_K - Medium)
llama_model_load_internal: n_ff = 28672
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 65B
llama_model_load_internal: ggml ctx size = 0.18 MB
error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
llama_init_from_file: failed to load model
--- Error loading backend:

---Error: Model load failure...
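The shape mismatch (expected 8192 x 8192, got 8192 x 1024) is characteristic of LLaMA-2 70B models, which use grouped-query attention: each K/V head is shared across 8 query heads, so the wk/wv tensors are 8x narrower than a GQA-unaware loader expects. The arithmetic below shows where 1024 comes from; in the ggmlv3 era this typically meant the loader had to be told about GQA (llama-cpp-python versions of that period exposed an `n_gqa`-style parameter for 70B models — parameter name and availability depend on the installed version), so upgrading llama-cpp-python and passing that option, if magi's load parameters allow it, is the usual route:

```python
# Hyperparameters taken from the load log above
n_embd, n_head, n_gqa = 8192, 64, 8

head_dim = n_embd // n_head       # 128
n_kv_head = n_head // n_gqa       # 8 K/V heads shared across 64 query heads
kv_dim = head_dim * n_kv_head     # 1024 -> matches "got 8192 x 1024"
print(kv_dim)
```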

Minor issue: path syntax for activating venv on Windows

Your venv instructions for Windows are:
./.magi-venv/Scripts/activate ### For Windows

But the correct syntax is:
.\.magi_venv\Scripts\activate ## For Windows

Super small issue, but it will give an error otherwise:

./.magi-venv/Scripts/activate : The term './.magi-venv/Scripts/activate' is not recognized as the name of a cmdlet,
function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the
path is correct and try again.
