Feature Request: Are we going to support the llama 3 model?

[Feature] Add llama 3 model (gpt4all, CLOSED)

qiweiii commented on June 23, 2024 9

[Feature] Add llama 3 model

Comments (7)

woheller69 commented on June 23, 2024 3

This fixes the issue for me with GGUFs:
https://www.reddit.com/r/LocalLLaMA/comments/1c7dkxh/tutorial_how_to_make_llama3instruct_ggufs_less/

Problem: Llama-3 uses 2 different stop tokens, but llama.cpp only has support for one. The instruct models seem to always generate a <|eot_id|> but the GGUF uses <|end_of_text|>.

Solution: Edit the GGUF file so it uses the correct stop token.

./gguf-py/scripts/gguf-set-metadata.py /path/to/llama-3.gguf tokenizer.ggml.eos_token_id 128009
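
To see why the wrong stop token leads to runaway output, here is a minimal sketch in plain Python. It is not llama.cpp or GPT4All code; the helper `collect_until_stop` and the sample token list are hypothetical. The ID 128009 comes from the command above, and 128001 for <|end_of_text|> is assumed from the Llama 3 tokenizer.

```python
from typing import Iterable, List, Set

# Llama 3 special-token IDs. 128009 is taken from the command above;
# 128001 for <|end_of_text|> is an assumption based on the Llama 3 tokenizer.
END_OF_TEXT = 128001  # <|end_of_text|>
EOT_ID = 128009       # <|eot_id|>


def collect_until_stop(token_stream: Iterable[int], stop_ids: Set[int]) -> List[int]:
    """Return the tokens emitted before the first stop token (hypothetical helper)."""
    output: List[int] = []
    for token in token_stream:
        if token in stop_ids:
            break
        output.append(token)
    return output


# Made-up model output: a short answer, then <|eot_id|>, then runaway text.
generated = [9906, 1917, EOT_ID, 40, 1097, 2103, 7556]

# Only <|end_of_text|> counts as a stop token, so nothing ends the answer.
runaway = collect_until_stop(generated, {END_OF_TEXT})

# With <|eot_id|> as the stop token, the answer ends where the model intended.
fixed = collect_until_stop(generated, {EOT_ID})

print(len(runaway), len(fixed))
```

With the unedited metadata, the whole stream is kept, including the <|eot_id|> token itself; with the edited metadata, only the tokens before <|eot_id|> are kept.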

dontcryme commented on June 23, 2024 3

Fixed instruct model link: https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF
I tested the above model and it works 100%.

woheller69 commented on June 23, 2024 1

No, I think then you need to change the max number of tokens you want, or manually press stop

woheller69 commented on June 23, 2024

For me it works, but there is an issue: after the first answer, the end of the answer does not seem to be detected and the CPU stays at 100%.
A second question will then not be answered. If you stop the model's answer before the end, it works.

davidsilvasmith commented on June 23, 2024

Yes, very much hoping for Llama 3 in GPT4All!

davidsilvasmith commented on June 23, 2024

Thank you, very helpful! I'm on an M1 MacBook Air with 16GB RAM.

I downloaded this model https://huggingface.co/lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/blob/main/Meta-Llama-3-8B-Instruct-Q5_K_M.gguf and put it in ~/Library/Application\ Support/nomic.ai/GPT4All/ directory and it's working great at about 4.1 - 4.4 tokens per second with my RAM full and about 7.5GB swap.

davidsilvasmith commented on June 23, 2024

Maybe I spoke too soon: in another test it kept talking and didn't stop until I told it to. So maybe I need to set the correct stop token too.
