atinoda / text-generation-webui-docker
Docker variants of oobabooga's text-generation-webui, including pre-built images.
License: GNU Affero General Public License v3.0
Using atinoda/text-generation-webui:llama-cpu-nightly, the llama-cpu variant fails to load the model for a different reason (most likely missing GGUF support in the bundled llama.cpp loader).
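As a quick local sanity check (a sketch, not part of the project), you can verify that the file on disk really is GGUF before suspecting the loader: valid GGUF files begin with the 4-byte ASCII magic b"GGUF", while older GGML files typically begin with b"lmgg" (the "ggml" magic as stored on disk). The path below is the mount point from this report.

```python
# Minimal sketch: a valid GGUF file begins with the ASCII magic b"GGUF".
# Older GGML-format files instead typically start with b"lmgg".
def is_gguf(path: str) -> bool:
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# e.g. run inside the container: is_gguf("/models/ggml-model-f16.gguf")
```

If this returns True but the nightly llama-cpu image still fails, the bundled loader version is the more likely culprit.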
Loading ggml-model-f16.gguf…
2023-09-02 13:05:52 INFO:Loading ggml-model-f16.gguf...
text-generation-webui | 2023-09-02 13:05:52 INFO:llama.cpp weights detected: /models/ggml-model-f16.gguf
text-generation-webui | 2023-09-02 13:05:52 INFO:Cache capacity is 0 bytes
text-generation-webui | llama_model_loader: loaded meta data with 18 key-value pairs and 291 tensors from /models/ggml-model-f16.gguf (version GGUF V2 (latest))
text-generation-webui | llama_model_loader: - tensor 0: token_embd.weight f16 [ 4096, 32000, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 1: blk.0.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 2: blk.0.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 3: blk.0.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 4: blk.0.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 5: blk.0.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 6: blk.0.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 7: blk.0.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 8: blk.0.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 9: blk.0.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 10: blk.1.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 11: blk.1.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 12: blk.1.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 13: blk.1.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 14: blk.1.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 15: blk.1.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 16: blk.1.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 17: blk.1.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 18: blk.1.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 19: blk.10.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 20: blk.10.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 21: blk.10.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 22: blk.10.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 23: blk.10.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 24: blk.10.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 25: blk.10.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 26: blk.10.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 27: blk.10.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 28: blk.11.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 29: blk.11.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 30: blk.11.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 31: blk.11.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 32: blk.11.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 33: blk.11.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 34: blk.11.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 35: blk.11.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 36: blk.11.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 37: blk.12.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 38: blk.12.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 39: blk.12.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 40: blk.12.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 41: blk.12.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 42: blk.12.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 43: blk.12.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 44: blk.12.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 45: blk.12.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 46: blk.13.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 47: blk.13.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 48: blk.13.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 49: blk.13.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 50: blk.13.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 51: blk.13.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 52: blk.13.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 53: blk.13.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 54: blk.13.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 55: blk.14.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 56: blk.14.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 57: blk.14.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 58: blk.14.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 59: blk.14.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 60: blk.14.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 61: blk.14.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 62: blk.14.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 63: blk.14.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 64: blk.15.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 65: blk.15.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 66: blk.15.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 67: blk.15.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 68: blk.15.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 69: blk.15.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 70: blk.15.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 71: blk.15.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 72: blk.15.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 73: blk.16.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 74: blk.16.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 75: blk.16.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 76: blk.16.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 77: blk.16.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 78: blk.16.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 79: blk.16.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 80: blk.16.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 81: blk.16.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 82: blk.17.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 83: blk.17.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 84: blk.17.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 85: blk.17.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 86: blk.17.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 87: blk.17.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 88: blk.17.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 89: blk.17.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 90: blk.17.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 91: blk.18.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 92: blk.18.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 93: blk.18.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 94: blk.18.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 95: blk.18.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 96: blk.18.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 97: blk.18.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 98: blk.18.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 99: blk.18.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 100: blk.19.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 101: blk.19.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 102: blk.19.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 103: blk.19.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 104: blk.19.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 105: blk.19.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 106: blk.19.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 107: blk.19.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 108: blk.19.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 109: blk.2.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 110: blk.2.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 111: blk.2.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 112: blk.2.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 113: blk.2.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 114: blk.2.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 115: blk.2.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 116: blk.2.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 117: blk.2.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 118: blk.20.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 119: blk.20.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 120: blk.20.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 121: blk.20.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 122: blk.20.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 123: blk.20.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 124: blk.20.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 125: blk.20.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 126: blk.20.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 127: blk.21.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 128: blk.21.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 129: blk.21.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 130: blk.21.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 131: blk.21.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 132: blk.21.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 133: blk.21.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 134: blk.21.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 135: blk.21.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 136: blk.22.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 137: blk.22.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 138: blk.22.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 139: blk.22.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 140: blk.22.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 141: blk.22.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 142: blk.22.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 143: blk.22.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 144: blk.22.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 145: blk.23.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 146: blk.23.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 147: blk.23.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 148: blk.23.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 149: blk.23.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 150: blk.23.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 151: blk.23.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 152: blk.23.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 153: blk.23.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 154: blk.3.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 155: blk.3.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 156: blk.3.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 157: blk.3.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 158: blk.3.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 159: blk.3.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 160: blk.3.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 161: blk.3.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 162: blk.3.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 163: blk.4.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 164: blk.4.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 165: blk.4.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 166: blk.4.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 167: blk.4.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 168: blk.4.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 169: blk.4.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 170: blk.4.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 171: blk.4.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 172: blk.5.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 173: blk.5.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 174: blk.5.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 175: blk.5.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 176: blk.5.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 177: blk.5.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 178: blk.5.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 179: blk.5.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 180: blk.5.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 181: blk.6.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 182: blk.6.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 183: blk.6.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 184: blk.6.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 185: blk.6.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 186: blk.6.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 187: blk.6.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 188: blk.6.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 189: blk.6.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 190: blk.7.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 191: blk.7.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 192: blk.7.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 193: blk.7.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 194: blk.7.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 195: blk.7.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 196: blk.7.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 197: blk.7.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 198: blk.7.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 199: blk.8.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 200: blk.8.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 201: blk.8.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 202: blk.8.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 203: blk.8.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 204: blk.8.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 205: blk.8.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 206: blk.8.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 207: blk.8.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 208: blk.9.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 209: blk.9.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 210: blk.9.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 211: blk.9.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 212: blk.9.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 213: blk.9.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 214: blk.9.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 215: blk.9.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 216: blk.9.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 217: output.weight f16 [ 4096, 32000, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 218: blk.24.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 219: blk.24.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 220: blk.24.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 221: blk.24.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 222: blk.24.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 223: blk.24.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 224: blk.24.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 225: blk.24.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 226: blk.24.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 227: blk.25.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 228: blk.25.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 229: blk.25.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 230: blk.25.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 231: blk.25.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 232: blk.25.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 233: blk.25.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 234: blk.25.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 235: blk.25.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 236: blk.26.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 237: blk.26.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 238: blk.26.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 239: blk.26.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 240: blk.26.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 241: blk.26.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 242: blk.26.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 243: blk.26.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 244: blk.26.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 245: blk.27.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 246: blk.27.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 247: blk.27.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 248: blk.27.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 249: blk.27.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 250: blk.27.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 251: blk.27.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 252: blk.27.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 253: blk.27.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 254: blk.28.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 255: blk.28.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 256: blk.28.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 257: blk.28.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 258: blk.28.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 259: blk.28.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 260: blk.28.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 261: blk.28.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 262: blk.28.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 263: blk.29.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 264: blk.29.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 265: blk.29.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 266: blk.29.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 267: blk.29.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 268: blk.29.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 269: blk.29.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 270: blk.29.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 271: blk.29.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 272: blk.30.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 273: blk.30.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 274: blk.30.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 275: blk.30.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 276: blk.30.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 277: blk.30.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 278: blk.30.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 279: blk.30.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 280: blk.30.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 281: blk.31.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 282: blk.31.ffn_down.weight f16 [ 11008, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 283: blk.31.ffn_gate.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 284: blk.31.ffn_up.weight f16 [ 4096, 11008, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 285: blk.31.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 286: blk.31.attn_k.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 287: blk.31.attn_output.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 288: blk.31.attn_q.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 289: blk.31.attn_v.weight f16 [ 4096, 4096, 1, 1 ]
text-generation-webui | llama_model_loader: - tensor 290: output_norm.weight f32 [ 4096, 1, 1, 1 ]
text-generation-webui | llama_model_loader: - kv 0: general.architecture str
text-generation-webui | llama_model_loader: - kv 1: general.name str
text-generation-webui | llama_model_loader: - kv 2: llama.context_length u32
text-generation-webui | llama_model_loader: - kv 3: llama.embedding_length u32
text-generation-webui | llama_model_loader: - kv 4: llama.block_count u32
text-generation-webui | llama_model_loader: - kv 5: llama.feed_forward_length u32
text-generation-webui | llama_model_loader: - kv 6: llama.rope.dimension_count u32
text-generation-webui | llama_model_loader: - kv 7: llama.attention.head_count u32
text-generation-webui | llama_model_loader: - kv 8: llama.attention.head_count_kv u32
text-generation-webui | llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32
text-generation-webui | llama_model_loader: - kv 10: general.file_type u32
text-generation-webui | llama_model_loader: - kv 11: tokenizer.ggml.model str
text-generation-webui | llama_model_loader: - kv 12: tokenizer.ggml.tokens arr
text-generation-webui | llama_model_loader: - kv 13: tokenizer.ggml.scores arr
text-generation-webui | llama_model_loader: - kv 14: tokenizer.ggml.token_type arr
text-generation-webui | llama_model_loader: - kv 15: tokenizer.ggml.bos_token_id u32
text-generation-webui | llama_model_loader: - kv 16: tokenizer.ggml.eos_token_id u32
text-generation-webui | llama_model_loader: - kv 17: tokenizer.ggml.unknown_token_id u32
text-generation-webui | llama_model_loader: - type f32: 65 tensors
text-generation-webui | llama_model_loader: - type f16: 226 tensors
text-generation-webui | llm_load_print_meta: format = GGUF V2 (latest)
text-generation-webui | llm_load_print_meta: arch = llama
text-generation-webui | llm_load_print_meta: vocab type = SPM
text-generation-webui | llm_load_print_meta: n_vocab = 32000
text-generation-webui | llm_load_print_meta: n_merges = 0
text-generation-webui | llm_load_print_meta: n_ctx_train = 2048
text-generation-webui | llm_load_print_meta: n_ctx = 2048
text-generation-webui | llm_load_print_meta: n_embd = 4096
text-generation-webui | llm_load_print_meta: n_head = 32
text-generation-webui | llm_load_print_meta: n_head_kv = 32
text-generation-webui | llm_load_print_meta: n_layer = 32
text-generation-webui | llm_load_print_meta: n_rot = 128
text-generation-webui | llm_load_print_meta: n_gqa = 1
text-generation-webui | llm_load_print_meta: f_norm_eps = 1.0e-05
text-generation-webui | llm_load_print_meta: f_norm_rms_eps = 1.0e-05
text-generation-webui | llm_load_print_meta: n_ff = 11008
text-generation-webui | llm_load_print_meta: freq_base = 10000.0
text-generation-webui | llm_load_print_meta: freq_scale = 1
text-generation-webui | llm_load_print_meta: model type = 7B
text-generation-webui | llm_load_print_meta: model ftype = mostly F16
text-generation-webui | llm_load_print_meta: model size = 6.74 B
text-generation-webui | llm_load_print_meta: general.name = ..
text-generation-webui | llm_load_print_meta: BOS token = 1 '<s>'
text-generation-webui | llm_load_print_meta: EOS token = 2 '</s>'
text-generation-webui | llm_load_print_meta: UNK token = 0 '<unk>'
text-generation-webui | llm_load_print_meta: LF token = 13 '<0x0A>'
text-generation-webui | /scripts/docker-entrypoint.sh: line 69: 90 Illegal instruction (core dumped) "${LAUNCHER[@]}"
text-generation-webui exited with code 132
I am able to run the model using llama-cpp directly, so I'm not sure what is going wrong here. Please let me know if you have any insights.
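Exit code 132 corresponds to SIGILL ("Illegal instruction"), which usually means the prebuilt llama.cpp binary was compiled with CPU instructions (commonly AVX2) that the host CPU does not support. A quick hedged check, assuming a Linux host with /proc/cpuinfo:

```shell
# Check whether the host CPU advertises AVX2 (one plausible cause of
# SIGILL / exit code 132 with prebuilt llama.cpp wheels).
if grep -qw avx2 /proc/cpuinfo 2>/dev/null; then
  echo "avx2: yes"
else
  echo "avx2: no"
fi
```

If the answer is "no", a locally rebuilt image (compiling llama.cpp for the host CPU) is likely required.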
WARN[0000] /mnt/extradisk/text-generation-webui-docker/docker-compose.yml: `version` is obsolete
Attaching to text-generation-webui
Error response from daemon: failed to create endpoint text-generation-webui on network text-generation-webui-docker_default: failed to add the host (veth7a52283) <=> sandbox (vetheb0e327) pair interfaces: operation not supported
I have run git pull and docker compose pull, but the issue is still unsolved. Thanks for your help!
Currently able to install oobabot-plugin, but once installed it does not run due to not having the necessary module installed. Is there a way to specify additional modules to install on bootup?
Is there a way to run with an AMD GPU?
scripts/build_extensions.sh has Windows line endings (CRLF) in it, which crashes during build (Docker Desktop on Windows) with a syntax error.
Run dos2unix on it, or copy it into Notepad++ with the line endings set to Unix (LF), and it'll sail through.
Thanks for all the work.
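The same fix can be scripted if dos2unix isn't installed; a minimal sketch, assuming GNU sed (the demo filename is made up for illustration):

```shell
# Create a demo file with Windows (CRLF) line endings, then convert it to LF.
printf 'echo hello\r\necho world\r\n' > build_extensions_demo.sh
sed -i 's/\r$//' build_extensions_demo.sh   # strips trailing carriage returns, like dos2unix
# Verify: no carriage returns remain.
if grep -q "$(printf '\r')" build_extensions_demo.sh; then
  echo "still CRLF"
else
  echo "converted"
fi
```

Adding `*.sh text eol=lf` to a `.gitattributes` file is the usual way to stop Git on Windows from reintroducing CRLF on checkout.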
Steps to Reproduce:
Expected Behavior:
The character's YAML file should be successfully saved in the character directory. The character should be available for selection even after reloading.
Actual Behavior:
The character's YAML file is not created, and the character is not available for selection after reloading.
Hi, I noticed the removal of the AutoGPTQ build in the latest. Is it because of this bug? AutoGPTQ/AutoGPTQ#128
Sounds like @PanQiWei is on it, so that's good, but my biggest question is whether this effectively removes the GPU support. I admit I'm still a bit new to this space, so please let me know if there is some alternative mechanism in place now for GPU support. I had been downloading .safetensors versions of models, but should I be looking for something different now?
Hey,
I was wanting to check if it is possible to run this container without a GPU?
Thanks,
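Running without a GPU should be a matter of choosing a CPU variant tag and dropping the Nvidia device reservation. A hedged compose sketch (the llama-cpu tag is taken from the variants mentioned elsewhere in these issues; check the published tag list before relying on it):

```yaml
version: "3"
services:
  text-generation-webui-docker:
    image: atinoda/text-generation-webui:llama-cpu  # CPU-only variant
    container_name: text-generation-webui
    environment:
      - EXTRA_LAUNCH_ARGS="--listen --verbose"
    ports:
      - 7860:7860
    # note: no deploy.resources.reservations.devices section, so no GPU is requested
```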
I am currently running the container on Unraid. I have used the docker compose file as well as manually creating the container and changing storage mounts. I am able to download the models from HF, and when I select the GGUF model from the dropdown it selects the llama.cpp loader. I have tried many different variations of settings, but no combination works. This is also true of ctransformers. As soon as I click load, the container crashes with no logs. I am passing in my GTX 1070 with 8GB of VRAM and it is visible from within the container by running nvidia-smi. I have tried the DEFAULT, NVIDIA and even snapshot images from 2023. I am not sure what I am doing wrong.
Intel Arc GPUs have their own images now, according to the developments in the upstream project. I do not have the hardware to test them - so please give them a go! Reports are welcomed.
It looks like the default version (and the CPU one too) does not run on M1. I tried specifying EXTRA_LAUNCH_ARGS="--listen --verbose --loader llama.cpp"; that, however, did not have any effect.
It would be nice to have it fixed, since llama.cpp actually runs pretty well with smaller models on M1.
Any pointers?
[+] Building 0.0s (0/0)
[+] Running 2/0
✔ Container text-generation-webui  Recreated  0.1s
! text-generation-webui-docker The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s
Attaching to text-generation-webui
text-generation-webui | === Running text-generation-webui variant: 'DEFAULT' snapshot-2023-10-15 ===
text-generation-webui | === (This version is 11 commits behind origin main) ===
text-generation-webui | === Image build date: 2023-10-18 11:30:52 ===
text-generation-webui | 2023-10-23 18:27:36 WARNING:
text-generation-webui | You are potentially exposing the web UI to the entire internet without any access password.
text-generation-webui | You can create one with the "--gradio-auth" flag like this:
text-generation-webui |
text-generation-webui | --gradio-auth username:password
text-generation-webui |
text-generation-webui | Make sure to replace username:password with your own.
text-generation-webui | /venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
text-generation-webui | warn("The installed version of bitsandbytes was compiled without GPU support. "
text-generation-webui | /venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
text-generation-webui | Traceback (most recent call last):
text-generation-webui | File "/app/server.py", line 31, in <module>
text-generation-webui | from modules import (
text-generation-webui | File "/app/modules/training.py", line 21, in <module>
text-generation-webui | from peft import (
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/__init__.py", line 22, in <module>
text-generation-webui | from .auto import (
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/auto.py", line 31, in <module>
text-generation-webui | from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/mapping.py", line 23, in <module>
text-generation-webui | from .peft_model import (
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/peft_model.py", line 38, in <module>
text-generation-webui | from .tuners import (
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/tuners/__init__.py", line 21, in <module>
text-generation-webui | from .lora import LoraConfig, LoraModel
text-generation-webui | File "/venv/lib/python3.10/site-packages/peft/tuners/lora.py", line 45, in <module>
text-generation-webui | import bitsandbytes as bnb
text-generation-webui | File "/venv/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 16, in <module>
text-generation-webui | from .nn import modules
text-generation-webui | File "/venv/lib/python3.10/site-packages/bitsandbytes/nn/__init__.py", line 6, in <module>
text-generation-webui | from .triton_based_modules import SwitchBackLinear, SwitchBackLinearGlobal, SwitchBackLinearVectorwise, StandardLinear
text-generation-webui | File "/venv/lib/python3.10/site-packages/bitsandbytes/nn/triton_based_modules.py", line 8, in <module>
text-generation-webui | from bitsandbytes.triton.dequantize_rowwise import dequantize_rowwise
text-generation-webui | File "/venv/lib/python3.10/site-packages/bitsandbytes/triton/dequantize_rowwise.py", line 10, in <module>
text-generation-webui | import triton
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/__init__.py", line 20, in <module>
text-generation-webui | from .compiler import compile, CompilationError
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/compiler/__init__.py", line 1, in <module>
text-generation-webui | from .compiler import CompiledKernel, compile, instance_descriptor
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 27, in <module>
text-generation-webui | from .code_generator import ast_to_ttir
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/compiler/code_generator.py", line 8, in <module>
text-generation-webui | from .. import language
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/language/__init__.py", line 4, in <module>
text-generation-webui | from . import math
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/language/math.py", line 4, in <module>
text-generation-webui | from . import core
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/language/core.py", line 1376, in <module>
text-generation-webui | def minimum(x, y):
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 542, in jit
text-generation-webui | return decorator(fn)
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 534, in decorator
text-generation-webui | return JITFunction(
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 433, in __init__
text-generation-webui | self.run = self._make_launcher()
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 388, in _make_launcher
text-generation-webui | scope = {"version_key": version_key(),
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/runtime/jit.py", line 120, in version_key
text-generation-webui | ptxas = path_to_ptxas()[0]
text-generation-webui | File "/venv/lib/python3.10/site-packages/triton/common/backend.py", line 114, in path_to_ptxas
text-generation-webui | result = subprocess.check_output([ptxas_bin, "--version"], stderr=subprocess.STDOUT)
text-generation-webui | File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
text-generation-webui | return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
text-generation-webui | File "/usr/lib/python3.10/subprocess.py", line 526, in run
text-generation-webui | raise CalledProcessError(retcode, process.args,
text-generation-webui | subprocess.CalledProcessError: Command '['/venv/lib/python3.10/site-packages/triton/common/../third_party/cuda/bin/ptxas', '--version']' died with <Signals.SIGTRAP: 5>.
text-generation-webui exited with code 1
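The warning near the top of the log (requested linux/amd64 image on a linux/arm64/v8 host) suggests the published images are amd64-only, so on Apple Silicon Docker falls back to emulation, under which the CUDA-oriented default image is unlikely to work. If you only want to silence the platform warning while testing, compose can pin the platform explicitly; a hedged sketch (emulation will be slow, and a native arm64 build would be needed for real use):

```yaml
services:
  text-generation-webui-docker:
    image: atinoda/text-generation-webui:llama-cpu
    platform: linux/amd64   # run the amd64 image under emulation on Apple Silicon
```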
Find a way to make it persistent, e.g. apply a patch after you get https://github.com/oobabooga/text-generation-webui.git in the Dockerfile to change the default dir for settings.yaml to e.g. /extensions.
It's absolutely annoying.
Here's the patch you have to use:
https://github.com/Gee1111/text-generation-webui/blob/main/make_settings_persistent.patch
Edit: sent you a merge request, but it's better to put make_settings_persistent.patch in your own repo and access it from there.
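An alternative to patching the default path — assuming the app still reads /app/settings.yaml — is to bind-mount the file alongside the other config mounts already used in the compose file:

```yaml
volumes:
  - ./config/settings.yaml:/app/settings.yaml  # persist settings without patching the source
```

The file must exist on the host before the first `docker compose up`, otherwise Docker will create it as a directory.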
Running in kubernetes, it works fine except the EXTRA_LAUNCH_ARGS are not honored as env vars.
The following I would expect to enable the api and preload a model, neither are enabled though.
- name: EXTRA_LAUNCH_ARGS
value: "--listen --verbose --api --extensions api --model TheBloke/vicuna-13B-v1.5-GPTQ --gpus all"
full kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: text-gen-webui
  namespace: text-gen-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      component: text-gen-webui
  template:
    metadata:
      labels:
        component: text-gen-webui
    spec:
      tolerations:
        - key: "nvidia.com/gpu"
          value: present
          effect: NoSchedule
      containers:
        - name: text-gen-demo-container
          image: atinoda/text-generation-webui
          ports:
            - containerPort: 7860
            - containerPort: 5000
            - containerPort: 5005
          resources:
            limits:
              nvidia.com/gpu: "1"
          env:
            - name: EXTRA_LAUNCH_ARGS
              value: "--listen --verbose --api --extensions api --model TheBloke/vicuna-13B-v1.5-GPTQ --gpus all"
            - name: TORCH_CUDA_ARCH_LIST
              value: "7.5"
          volumeMounts:
            - name: text-gen-demo-pvc
              mountPath: /app/loras
              subPath: loras
            - name: text-gen-demo-pvc
              mountPath: /app/models
              subPath: models
            - name: shm
              mountPath: /dev/shm
      volumes:
        - name: text-gen-demo-pvc
          emptyDir: {}
        - name: shm
          emptyDir:
            medium: Memory
            sizeLimit: 1Gi
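One thing worth checking: in the compose file's list syntax, EXTRA_LAUNCH_ARGS="--listen ..." makes the double quotes part of the value, whereas in a Kubernetes manifest value: "--listen ..." the quotes are YAML quoting and are not part of the value. An entrypoint written to strip surrounding quotes may therefore parse the two differently. A sketch of the difference (the quote-stripping step is an assumption about how the launcher handles the variable):

```shell
# Compose list syntax: the quotes are literally part of the value.
EXTRA_LAUNCH_ARGS='"--listen --verbose --api"'
# A launcher expecting that format would strip the outer quotes before splitting:
stripped=$(printf '%s' "$EXTRA_LAUNCH_ARGS" | sed 's/^"//; s/"$//')
set -- $stripped            # word-split into individual launch args (intentionally unquoted)
echo "$# args:" "$@"
```

If the k8s value arrives without quotes, a strip-then-split launcher still works, so the more likely culprit is the entrypoint never seeing the variable at all; `kubectl exec ... -- env | grep EXTRA` would confirm.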
I used this docker-compose to try to run under CPU, but when loading model I get a CUDA version error:
text-generation-webui | === Running text-generation-webui variant: 'DEFAULT' ===
text-generation-webui | === (This version is 75 commits behind origin) ===
text-generation-webui | === Image build date: 2023-07-18 18:43:00 ===
text-generation-webui | /venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
text-generation-webui | /venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
text-generation-webui | warn("The installed version of bitsandbytes was compiled without GPU support. "
text-generation-webui | 2023-07-30 18:34:51 INFO:Loading ggml-model-q4_1.bin...
text-generation-webui | CUDA error 35 at ggml-cuda.cu:2478: CUDA driver version is insufficient for CUDA runtime version
text-generation-webui | /arrow/cpp/src/arrow/filesystem/s3fs.cc:2598: arrow::fs::FinalizeS3 was not called even though S3 was initialized. This could lead to a segmentation fault at exit
text-generation-webui exited with code 1
Isn't this supposed to load without trying to use an Nvidia GPU? I have an AMD card; I was trying to use the CPU instead, but it doesn't work...
I'm on Linux; do I need any extra steps?
I just tried to start it with docker compose up
and faced the following error.
FileNotFoundError: [Errno 2] No such file or directory: 'presets/simple-1.yaml'
Any idea how to fix this?
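A common cause of that FileNotFoundError is bind-mounting an empty ./config/presets directory over /app/presets, which hides the defaults shipped in the image (including presets/simple-1.yaml). A hedged guard sketch — the directory names are taken from the compose file used elsewhere in these issues:

```shell
# Warn before launch if the bind-mounted presets dir is empty:
# mounting an empty dir over /app/presets hides the image's bundled presets.
dir=./config/presets
mkdir -p "$dir"
if [ -z "$(ls -A "$dir")" ]; then
  echo "WARN: $dir is empty; the defaults in /app/presets will be shadowed"
else
  echo "presets present"
fi
```

Copying the defaults out of the image first (e.g. with docker cp from a stopped container) before starting with the mount also avoids the shadowing.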
Thanks
Amazing work! It was very easy to use! I followed all the instructions and successfully ran a container. I thought the folders inside the image would be mapped to volumes inside docker desktop, but they are linked to windows folders:
Why is it configured this way and not using the volumes that docker themselves recommend? Docker volumes increase portability and persist after updating the container just as well.
I am using Windows 11 Home edition + WSL, running an Nvidia GTX 3060, and CUDA GPU acceleration seems to work just fine.
I am looking for a working Docker configuration with an Nginx proxy? Did you get this working by chance?
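I haven't verified this against this specific image, but since the UI is Gradio-based, a reverse proxy generally needs WebSocket upgrade headers or the interface loads but never updates. A minimal hedged nginx sketch (the server name is a placeholder, and 7860 is the default web port from the compose file):

```nginx
server {
    listen 80;
    server_name textgen.example.com;  # assumed hostname

    location / {
        proxy_pass http://127.0.0.1:7860;         # default web port
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;   # required for Gradio's websockets
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```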
No luck for me when trying to use this. Am I missing something? Thanks.
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ gs
++ pwd
+ current_dir=/home/dewi/code/text-generation-webui-docker/text-generation-webui-docker
+ [[ /home/dewi/code/text-generation-webui-docker/text-generation-webui-docker == \/\m\n\t\/\c* ]]
+ /usr/bin/git status -v -v
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: docker-compose.yml
--------------------------------------------------
Changes not staged for commit:
diff --git i/docker-compose.yml w/docker-compose.yml
index d1caff0..33dda5e 100644
--- i/docker-compose.yml
+++ w/docker-compose.yml
@@ -1,7 +1,7 @@
version: "3"
services:
text-generation-webui-docker:
- image: atinoda/text-generation-webui:default # Specify variant as the :tag
+ image: atinoda/text-generation-webui:llama-cpu # Specify variant as the :tag
container_name: text-generation-webui
environment:
- EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME)
no changes added to commit (use "git add" and/or "git commit -a")
git status
commit d4b58daffec5096e2a7057388420e74987537766 (HEAD -> master, origin/master, origin/HEAD)
Author: Atinoda <[email protected]>
Date: Wed Oct 18 15:49:48 2023 +0100
Separate nightly builds
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ docker compose up
Attaching to text-generation-webui
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$
Cannot seem to connect to the webui through any means due to ERR_CONNECTION_REFUSED. I've checked all firewall settings etc. It appears to me that this may be the root of the issue:
https://pythonspeed.com/articles/docker-connection-refused/
As a newbie to all this, not been able to fix yet.
My spec: RTX 3060 Ti, R5 5600X, 16GB RAM
I want to load DeepSeek models. I have tried DeepSeek 6.7B and 33B, and neither of these models works. I have run several 7B, 14B and 34B models and they work without any problems. When I try to load a DeepSeek model, the webui just crashes immediately.
Here is the error output from terminal:
text-generation-webui | llama_model_loader: - type f32: 65 tensors
text-generation-webui | llama_model_loader: - type q6_K: 226 tensors
text-generation-webui | ERROR: byte not found in vocab: '
text-generation-webui | '
text-generation-webui | /scripts/docker-entrypoint.sh: line 69: 98 Segmentation fault (core dumped) "${LAUNCHER[@]}"
text-generation-webui exited with code 139
chj@archlinux /m/m/text-generation-webui-docker (master)>
Running the default build on Windows Docker with WSL2 Ubuntu 24.04. I'm unable to access the UI at http://127.0.0.1:7860
Log shows:
2024-06-06 13:34:17 === Running text-generation-webui variant: 'Nvidia Extended' abe5ddc8833206381c43b002e95788d4cca0893a ===
2024-06-06 13:34:18 === (This version is 0 commits behind origin main) ===
2024-06-06 13:34:18 === Image build date: 2024-06-04 21:34:25 ===
2024-06-06 13:34:21 20:34:21-479491 INFO Starting Text generation web UI
2024-06-06 13:34:22
2024-06-06 13:34:22 Running on local URL: http://127.0.0.1:7860
2024-06-06 13:34:22
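The line "Running on local URL: http://127.0.0.1:7860" is the likely culprit: inside the container the server is binding to loopback only, so Docker's port mapping has nothing to forward to. Passing --listen (which makes the server bind 0.0.0.0, as seen in other logs in this thread) should fix it; a compose fragment using the same environment syntax as the repo's docker-compose.yml:

```yaml
environment:
  - EXTRA_LAUNCH_ARGS="--listen --verbose"  # bind 0.0.0.0 so the mapped port is reachable
```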
Setting BUILD_EXTENSIONS_LIVE="coqui_tts" does not build the extension live, and it still needs to be enabled manually in the webui.
When coqui_tts is then enabled in the webui, it expects user input:
text-generation-webui | 2023-12-15 22:33:07 INFO:Loading the extension "coqui_tts"...
text-generation-webui | [XTTS] Loading XTTS...
text-generation-webui | > You must agree to the terms of service to use this model.
text-generation-webui | | > Please see the terms of service at https://coqui.ai/cpml.txt
text-generation-webui | | > "I have read, understood and agreed to the Terms and Conditions." - [y/n]
text-generation-webui | 2023-12-15 22:33:15 ERROR:Failed to load the extension "coqui_tts".
text-generation-webui | Traceback (most recent call last):
text-generation-webui | File "/app/modules/extensions.py", line 41, in load_extensions
text-generation-webui | extension.setup()
text-generation-webui | File "/app/extensions/coqui_tts/script.py", line 180, in setup
text-generation-webui | model = load_model()
text-generation-webui | File "/app/extensions/coqui_tts/script.py", line 76, in load_model
text-generation-webui | model = TTS(params["model_name"]).to(params["device"])
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/api.py", line 81, in __init__
text-generation-webui | self.load_tts_model_by_name(model_name, gpu)
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/api.py", line 195, in load_tts_model_by_name
text-generation-webui | model_path, config_path, vocoder_path, vocoder_config_path, model_dir = self.download_model_by_name(
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/api.py", line 149, in download_model_by_name
text-generation-webui | model_path, config_path, model_item = self.manager.download_model(model_name)
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/utils/manage.py", line 433, in download_model
text-generation-webui | self.create_dir_and_download_model(model_name, model_item, output_path)
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/utils/manage.py", line 359, in create_dir_and_download_model
text-generation-webui | if not self.ask_tos(output_path):
text-generation-webui | File "/venv/lib/python3.10/site-packages/TTS/utils/manage.py", line 338, in ask_tos
text-generation-webui | answer = input(" | | > ")
text-generation-webui | EOFError: EOF when reading a line
text-generation-webui | | | > Running on local URL: http://0.0.0.0:7860
text-generation-webui |
text-generation-webui | To create a public link, set `share=True` in `launch()`.
After doing some digging seems like a known problem that has already been fixed in oobabooga, refer to this issue
version: "3"
services:
  text-generation-webui-docker:
    image: atinoda/text-generation-webui:default-nightly # Specify variant as the :tag
    container_name: text-generation-webui
    environment:
      - EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME)
      - BUILD_EXTENSIONS_LIVE="coqui_tts" # Install named extensions during every container launch. THIS WILL SIGNIFICANTLY SLOW LAUNCH TIME.
    ports:
      - 7860:7860 # Default web port
      - 5000:5000 # Default API port
      - 5005:5005 # Default streaming port
      - 5001:5001 # Default OpenAI API extension port
    volumes:
      - ./config/characters:/app/characters
      - ./config/loras:/app/loras
      - ./config/models:/app/models
      - ./config/presets:/app/presets
      - ./config/prompts:/app/prompts
      - ./config/training:/app/training
      - ./config/extensions:/app/extensions # Persist all extensions
      # - ./config/extensions/silero_tts:/app/extensions/silero_tts # Persist a single extension
    logging:
      driver: json-file
      options:
        max-file: "3" # number of files or file count
        max-size: '10m'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
docker-compose up -d
Access the webui on port 7860 -> enable coqui_tts from the session tab
Run docker-compose logs
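Until the upstream fix lands, a possible workaround — assuming the installed Coqui TTS version honours the COQUI_TOS_AGREED environment variable (newer TTS releases reportedly check it before prompting; verify against your version) — is to pre-accept the licence so the interactive ask_tos prompt never fires:

```yaml
environment:
  - EXTRA_LAUNCH_ARGS="--listen --verbose"
  - BUILD_EXTENSIONS_LIVE="coqui_tts"
  - COQUI_TOS_AGREED=1  # assumption: skips the interactive CPML licence prompt
```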
Is anyone able to get the alltalk_tts extension (https://github.com/erew123/alltalk_tts) to run properly in this container? I've done both the webui and standalone routes, but both throw errors. I'm running this container in Unraid, which otherwise has been fine. The coqui and silero TTS extensions seem to work fine.
Install has been following the alltalk_tts GitHub instructions (https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-quick-setup-text-generation-webui--standalone-installation). Console into the TGW docker container; an example workflow for the run-with-TGW option is:
#Put alltalk_tts git repo into extensions folder
cd extensions && git clone https://github.com/erew123/alltalk_tts
#Install curl, as the start_linux.sh script needs it
apt install curl -y
#Run the webui start_linux.sh script to get the setup dependencies, like conda, installed, which the alltalk scripts need to work
cd .. && ./start_linux.sh
#select options Nvidia and N for old CUDA version
#Start the env, which alltalk_tts atsetup.sh needs
./cmd_linux.sh
#Make the alltalk config script executable and start execution
cd extensions/alltalk_tts && chmod +x ./atsetup.sh && bash ./atsetup.sh
#option 1 to start install
Alltalk diag log and TGW container log when attempting to use Alltalk extension attached. Thoughts?
It seems that an update is needed.
https://github.com/oobabooga/text-generation-webui/pull/1538/files
I am getting this error when composing. My GPU is an RTX 3070 Ti:
Cannot start Docker Compose application. Reason: compose [start] exit status 1. Container text-generation-webui-text-generation-webui-1 Starting Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
Warning: Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work
Followed by
OSError: [Errno 7] Argument list too long: 'ffprobe'
This occurs when using the API. Unsure if the interface side works, I've not sorted out the SSL issues yet; but I can't imagine it works any better.
Hi, I'm trying to use one of the Mistral-derived models (which works on the non-docker text-gen), but I keep getting this error message. Things seem to work with the llama model. Am I doing something wrong, or is Mistral not supported on the default version? I tried with the nightly version but it seems to fail as well.
text-generation-webui | To create a public link, set `share=True` in `launch()`.
text-generation-webui | 2023-10-12 13:35:19 INFO:Loading Mistral-7B-OpenOrca...
text-generation-webui | 2023-10-12 13:35:19 ERROR:Failed to load the model.
text-generation-webui | Traceback (most recent call last):
text-generation-webui | File "/app/modules/ui_model_menu.py", line 198, in load_model_wrapper
text-generation-webui | shared.model, shared.tokenizer = load_model(shared.model_name, loader)
text-generation-webui | File "/app/modules/models.py", line 78, in load_model
text-generation-webui | output = load_func_map[loader](model_name)
text-generation-webui | File "/app/modules/models.py", line 122, in huggingface_loader
text-generation-webui | config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=params['trust_remote_code'])
text-generation-webui | File "/venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
text-generation-webui | config_class = CONFIG_MAPPING[config_dict["model_type"]]
text-generation-webui | File "/venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 734, in __getitem__
text-generation-webui | raise KeyError(key)
text-generation-webui | KeyError: 'mistral'
text-generation-webui |
Hi, I am running 'Oobabooga LLM WebUI'; everything is fine, but now I am trying to upload a training dataset to /app/training/datasets.
However, the GUI does not find anything; /app/models/ also appears empty even though a model is working.
Where do I have to upload the training dataset?
Thanks
No build errors, however I noticed the container wasn't actually launching. Ran with logs and this is all it produced.
ubuntu:~/text-generation-webui-docker$ sudo docker-compose logs -f
Attaching to text-generation-webui
text-generation-webui | exec /scripts/docker-entrypoint.sh: exec format error
text-generation-webui exited with code 1
This happens for two consecutive nightly versions, and I have also built an image from the 2024-03-10 snapshot version:
https://github.com/oobabooga/text-generation-webui/releases/tag/snapshot-2024-03-10 . The issue happens with both of them.
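"exec format error" on a shell-script entrypoint usually means an architecture mismatch (e.g. an amd64-only image on an arm64 host) or a damaged shebang/line endings in the script. A quick diagnostic sketch — the docker commands are shown as comments to run on the host:

```shell
# 1) Host architecture (x86_64, aarch64, ...):
uname -m
# 2) Compare with the image's architecture (run on the host):
#    docker image inspect atinoda/text-generation-webui:default-nightly \
#      --format '{{.Os}}/{{.Architecture}}'
# 3) If they match, inspect the entrypoint's first bytes for a broken
#    shebang or CRLF line endings:
#    docker run --rm --entrypoint head atinoda/text-generation-webui:default-nightly \
#      -c 32 /scripts/docker-entrypoint.sh | od -c | head
```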
This is the base-nvidia version.
When I try to load an exllamav2 model, I receive this error message:
File "/app/modules/ui_model_menu.py", line 245, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
File "/app/modules/models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
File "/app/modules/models.py", line 378, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
File "/app/modules/exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
File "/venv/lib/python3.10/site-packages/exllamav2/__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
File "/venv/lib/python3.10/site-packages/exllamav2/model.py", line 23, in <module>
    from exllamav2.config import ExLlamaV2Config
File "/venv/lib/python3.10/site-packages/exllamav2/config.py", line 2, in <module>
    from exllamav2.fasttensors import STFile
File "/venv/lib/python3.10/site-packages/exllamav2/fasttensors.py", line 5, in <module>
    from exllamav2.ext import exllamav2_ext as ext_c
File "/venv/lib/python3.10/site-packages/exllamav2/ext.py", line 15, in <module>
    import exllamav2_ext
ImportError: /venv/lib/python3.10/site-packages/exllamav2_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c107WarningC1ESt7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEERKNS_14SourceLocationESsb
I built an image from the official repo as well, and that worked flawlessly.
I think the issue could be this step from the official repository:
conda install -y -c "nvidia/label/cuda-12.1.1" cuda-runtime
I couldn't find this step in the Dockerfile here.
Thanks for the help!
For atinoda/text-generation-webui:llama-cpu-nightly:
⠹ text-generation-webui Pulling 1.2s
no matching manifest for linux/arm64/v8 in the manifest list entries
make: *** [up] Error 18
For reference, atinoda/text-generation-webui:llama-cpu works without error.
I wanted to make sure you're cool with me using and forking your repo to make an Unraid community app.
nvidia-smi displays the data shown below.
After a "docker-compose up" and some downloading, the messages with error 804 appear (see below).
I'm running this inside a host running Debian 12.
It seems like an Nvidia/torch related issue; any pointers on how to resolve it?
Thu Feb  1 12:42:13 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  Off |
|  0%   37C    P8    12W / 450W |      6MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1137      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Recreating text-generation-webui ... done
Attaching to text-generation-webui
text-generation-webui | === Running text-generation-webui variant: 'DEFAULT' snapshot-2023-12-31 ===
text-generation-webui | === (This version is 11 commits behind origin main) ===
text-generation-webui | === Image build date: 2024-01-04 22:56:23 ===
text-generation-webui | 11:39:20-264526 INFO Starting Text generation web UI
text-generation-webui | 11:39:20-266083 WARNING
text-generation-webui | You are potentially exposing the web UI to the entire internet without any access password.
text-generation-webui | You can create one with the "--gradio-auth" flag like this:
text-generation-webui |
text-generation-webui | --gradio-auth username:password
text-generation-webui |
text-generation-webui | Make sure to replace username:password with your own.
text-generation-webui | 11:39:20-266993 INFO Loading the extension "gallery"
text-generation-webui | Traceback (most recent call last):
text-generation-webui |   /app/server.py:254 in <module>
text-generation-webui |     253 # Launch the web UI
text-generation-webui |     254 create_interface()
text-generation-webui |     255 while True:
text-generation-webui |   /app/server.py:133 in create_interface
text-generation-webui |     132 ui_parameters.create_ui(shared.settings['preset'])  # Paramete
text-generation-webui |     133 ui_model_menu.create_ui()  # Model tab
text-generation-webui |     134 training.create_ui()  # Training tab
text-generation-webui |   /app/modules/ui_model_menu.py:36 in create_ui
text-generation-webui |     35 for i in range(torch.cuda.device_count()):
text-generation-webui |     36 total_mem.append(math.floor(torch.cuda.get_device_properti
text-generation-webui |   /venv/lib/python3.10/site-packages/torch/cuda/__init__.py:449 in get_device_properties
text-generation-webui |     449 _lazy_init()  # will define _get_device_properties
text-generation-webui |     450 device = _get_device_index(device, optional=True)
text-generation-webui |   /venv/lib/python3.10/site-packages/torch/cuda/__init__.py:298 in _lazy_init
text-generation-webui |     297 os.environ["CUDA_MODULE_LOADING"] = "LAZY"
text-generation-webui |     298 torch._C._cuda_init()
text-generation-webui | RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW
text-generation-webui exited with code 1
It would be very helpful to have the ability to use extensions.
Hello!
I just stumbled upon this and think it is a great addition to the quick-deployment options for text-generation-webui.
Looking through the code updates, it looks like we're about 2 weeks behind, but looking at the image tags, we track a default stable build and a nightly build.
I'm wondering why there aren't any older builds? Normally on Docker Hub you can revert back if necessary, but I don't see any of the older images. Will this be an option in the future?
Thank you!
Hi,
Firstly, thanks for maintaining the docker container!
I just want to know whether exllamav2 is integrated in the Docker container version.
I installed exllamav2 via pip from the container console and the install succeeded. But when trying to load https://huggingface.co/turboderp/CodeLlama-13B-instruct-2.65bpw-h6-exl2 (or any other exl2 model), it fails; see the logs below:
To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
File "/app/modules/ui_model_menu.py", line 196, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "/app/modules/models.py", line 79, in load_model
output = load_func_map[loader](model_name)
File "/app/modules/models.py", line 149, in huggingface_loader
model = LoaderClass.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16, trust_remote_code=shared.args.trust_remote_code)
File "/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 516, in from_pretrained
return model_class.from_pretrained(
File "/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2650, in from_pretrained
raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models/turboderp_Llama2-13B-4.0bpw-h6-exl2.
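For context, this OSError comes from the Transformers code path (huggingface_loader), which searches the model directory for standard weight files; exl2 quantizations ship only .safetensors shards, so the Transformers loader will not find anything it recognizes and the model should be loaded with the ExLlamav2 loader instead. A minimal sketch of that check, using the file names taken from the error message (the helper name is hypothetical, not webui code):

```python
from pathlib import Path

# Weight file names the Transformers loader searches for, per the error message
EXPECTED_WEIGHTS = [
    "pytorch_model.bin",    # PyTorch
    "tf_model.h5",          # TensorFlow
    "model.ckpt.index",     # TF checkpoint
    "flax_model.msgpack",   # Flax
]

def transformers_weights_present(model_dir: str) -> bool:
    """Hypothetical helper: True if the directory contains any weight
    file the classic Transformers loader recognizes."""
    d = Path(model_dir)
    return any((d / name).exists() for name in EXPECTED_WEIGHTS)
```

An exl2 model directory containing only .safetensors shards fails this check, which is why `from_pretrained` raises the OSError above; selecting the ExLlamav2 loader in the Model tab avoids that code path entirely.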
EXTRA_LAUNCH_ARGS only accepts the first argument and ignores the others.
I need, e.g.: --listen --trust-remote-code --gradio-auth user:mypw
so my EXTRA_LAUNCH_ARGS value is: --listen --trust-remote-code --gradio-auth user:mypw
Fixed it by wrapping the value in quotes ("...").
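For anyone hitting the same thing: an unquoted value with spaces can be split during shell/entrypoint handling, so only the first token survives. A sketch of the working form in docker-compose.yml (service name is illustrative, taken from the logs in this thread):

```yaml
services:
  text-generation-webui:
    environment:
      # Quote the whole value so all three flags are passed through together
      - EXTRA_LAUNCH_ARGS="--listen --trust-remote-code --gradio-auth user:mypw"
```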
I ran the container with docker compose up successfully, then opened the web UI (http://localhost:7860). I can download the model https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main successfully, and there is a TheBloke_Llama-2-7B-Chat-GGML/ directory in the models dir.
But when I choose the model and click the load button, I get this error:
text-generation-webui | 08:57:39-112718 INFO Loading "TheBloke_Llama-2-7B-Chat-GGML"
text-generation-webui | 08:57:39-152392 ERROR Failed to load the model.
text-generation-webui | Traceback (most recent call last):
text-generation-webui | File "/app/modules/ui_model_menu.py", line 242, in load_model_wrapper
text-generation-webui | shared.model, shared.tokenizer = load_model(selected_model, loader)
text-generation-webui | File "/app/modules/models.py", line 87, in load_model
text-generation-webui | output = load_func_map[loader](model_name)
text-generation-webui | File "/app/modules/models.py", line 247, in llamacpp_loader
text-generation-webui | model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
text-generation-webui | IndexError: list index out of range
I also tried modifying docker-compose.yml by adding --model TheBloke_Llama-2-7B-Chat-GGML to - EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME), but I get the same error.
How can I resolve it?
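The IndexError follows directly from the traceback above: the llama.cpp loader globs for *.gguf, but GGML repositories ship .bin files, so the glob result is empty and indexing [0] fails. A minimal sketch reproducing the failure (the directory and file names are illustrative):

```python
import tempfile
from pathlib import Path

# Simulate a downloaded GGML model directory: only .bin files, no .gguf
model_dir = Path(tempfile.mkdtemp()) / "TheBloke_Llama-2-7B-Chat-GGML"
model_dir.mkdir()
(model_dir / "llama-2-7b-chat.ggmlv3.q4_0.bin").touch()

# Same glob the loader performs in llamacpp_loader
matches = list(model_dir.glob("*.gguf"))
print(matches)  # [] -- so matches[0] raises IndexError, as in the log
```

The practical fix is to download a GGUF build of the model instead (TheBloke publishes matching GGUF repositories), since current llama-cpp-python builds no longer load the older GGML format.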
Hi, how do I train a LoRA with a GPTQ 4-bit model as its base? I tried with and without the new monkeypatch, and with all the different model loaders. I got the furthest using AutoGPTQ without the monkeypatch, which gave this response:
285, in _create_new_module
    raise ValueError(
ValueError: Target module QuantLinear() is not supported. Currently, only torch.nn.Linear and Conv1D are supported.
Just like this issue.
Any chance you could incorporate Ooba's new ExLlama support? It's more than just an upstream code update; it needs an additional clone of that repo, etc. https://github.com/oobabooga/text-generation-webui/blob/main/docs/ExLlama.md