Comments (4)
I believe this is due to the model having a default max_seq_len of 2048. Our mosaicml/mpt-7b-chat model should be able to extrapolate to longer sequences, but you have to set a different max_seq_len via the script arguments. For example:

```
python hf_chat.py -n mosaicml/mpt-7b-chat --max_seq_len 4096 ...
```
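If you are loading the model directly with transformers instead of through the script, the same override can be set on the config before loading. A minimal sketch following the pattern shown on the MPT model cards:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Raise the context limit from the default 2048 before instantiating the model.
config = AutoConfig.from_pretrained('mosaicml/mpt-7b-chat', trust_remote_code=True)
config.max_seq_len = 4096

model = AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b-chat',
    config=config,
    trust_remote_code=True,
)
```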
I think it'll eventually overflow unless the history is periodically pruned.
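For instance, a rough sketch of a guard you could run before each generate call (hypothetical helper; assumes the script's tokenizer and max_seq_len are in scope):

```python
def history_too_long(history: str, tokenizer, max_seq_len: int,
                     reserve: int = 512) -> bool:
    # Leave `reserve` tokens of headroom for the next user turn and the reply.
    n_tokens = len(tokenizer(history)['input_ids'])
    return n_tokens > max_seq_len - reserve
```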
True. Thanks for the suggestion.
We would need to add that logic to the model's source code on the HF Hub. I don't think we can make that change via llm-foundry code, so I'll close this issue for now and see whether automatic pruning is an option later.
Feel free to add more comments if other issues arise and I'll reopen if necessary.
One possibility would be to prune the conversation in the hf_chat.py script to some number of previous Q/A pairs, e.g.:

```diff
 def conversation(model, tokenizer: Tokenizer, user_inp: str, history: str,
                  **generate_kwargs: Dict[str, Any]) -> Tuple[str, str, float]:
     if history != '':
+        if len(history.split("<|im_start|>")) > 12:
+            # The first element from split() is the empty string, so skip it;
+            # keep the first turn and drop the oldest user input / assistant
+            # response pair.
+            newhistory = "<|im_start|>" + history.split("<|im_start|>")[1]
+            for y in history.split("<|im_start|>")[4:]:
+                newhistory += "<|im_start|>" + y
+            history = newhistory
```

(A little ugly, but it works for me.)
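The same idea pulled into a standalone helper might look like this (a hypothetical refactor, not part of llm-foundry; the default max_turns=11 mirrors the `> 12` split-length check above, but it drops all excess pairs in one pass rather than one pair per call):

```python
def prune_history(history: str, max_turns: int = 11) -> str:
    sep = '<|im_start|>'
    turns = history.split(sep)[1:]  # split() yields a leading empty string; drop it
    if len(turns) <= max_turns:
        return history
    # Keep the first turn (e.g. the initial prompt) plus the most recent turns.
    kept = [turns[0]] + turns[-(max_turns - 1):]
    return ''.join(sep + t for t in kept)
```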