Comments (4)
Good catch @saxenarohit! That limit should not be necessary. I tried reproducing the issue with the changes in #227, and it now runs correctly with prompts above 512 tokens!
Code:
import inseq
import torch
from transformers import AutoModelForCausalLM

model_name = "princeton-nlp/Sheared-LLaMA-1.3B"
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True)
inseq_model = inseq.load_model(model, "attention", tokenizer=model_name)  # tokenizer_kwargs={"legacy": False}
input_prompt = "This is a very long text" * 100
output_text = inseq_model.generate(input_prompt, do_sample=False, max_length=2048, skip_special_tokens=True)
out = inseq_model.attribute(
    input_texts=input_prompt,
    generated_texts=input_prompt + output_text[0],
    step_scores=["probability"],
)
out.show()
If it still doesn't work for you, please feel free to reopen at any time!
Hi @saxenarohit, thanks for the report. Could you provide me with the code you use to load the model (at the moment I cannot see which attribution method you attached) and an example input string causing the issue?
The code I am using to load the model:
inseq_model = inseq.load_model(model, "attention", tokenizer=tokenizer)
input_prompt = "This is a very long text" * 100
input_tokens = tokenizer.encode_plus(input_prompt, return_tensors="pt", padding="max_length", max_length=len(input_prompt)).to("cuda")
output_text = inseq_model.generate(input_tokens, do_sample=False, max_length=2048, skip_special_tokens=True)
print(output_text)
out = inseq_model.attribute(
    input_texts=input_prompt,
    attribute_target=False,
    generated_texts=output_text,
    step_scores=["probability"],
    very_verbose=True,
).show()
This gives the error above even if I change max_position_embeddings and max_sequence_length in model.config.
Generating output_text succeeds, but the attribution fails.
Do I need to set max_length in the attribute arguments?
If I reduce the length below 2048, it works fine.
Thanks for your prompt response.
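A side note (my own observation, not from the thread): in the snippet above, padding with max_length=len(input_prompt) pads to the character count of the prompt, not its token count, so the padded length does not correspond to what the model actually receives. A quick self-contained check:

```python
# Character length vs. a rough word count for the test prompt used above.
# Note the repetitions concatenate without a separator ("...textThis is...").
input_prompt = "This is a very long text" * 100

char_len = len(input_prompt)            # what max_length=len(input_prompt) pads to
word_count = len(input_prompt.split())  # crude proxy; a subword tokenizer will differ

print(char_len)    # 2400
print(word_count)  # 501 (boundary words merge, e.g. "textThis")
```

The subword token count for a LLaMA-style tokenizer will be different again, so it is safer to let the tokenizer decide the length rather than deriving it from character counts.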
Hi, it seems the issue lies in this code in huggingface_model.py at line 255, which caps the max length at the tokenizer's default of 512 tokens.
inseq/inseq/models/huggingface_model.py
Line 255 in 53eedc8
# Cap length with max_model_input_sizes instead
if max_length > 1e6:
    if hasattr(self.tokenizer, "max_model_input_sizes") and self.tokenizer.max_model_input_sizes:
        max_length = max(v for _, v in self.tokenizer.max_model_input_sizes.items())
    else:
        max_length = max_input_length
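To illustrate what this cap does, here is a minimal sketch with a hypothetical stand-in tokenizer (not inseq's actual objects). Many Hugging Face tokenizers expose max_model_input_sizes, a dict mapping checkpoint names to their maximum input length; the branch above picks the largest of those values (often 512) whenever no explicit max_length was set:

```python
from types import SimpleNamespace

# Hypothetical stand-in for a tokenizer; real HF tokenizers expose
# `max_model_input_sizes` as a dict of checkpoint name -> max input length.
tokenizer = SimpleNamespace(
    max_model_input_sizes={"bert-base-uncased": 512, "bert-large-uncased": 512}
)

max_length = int(1e6) + 1  # sentinel value: no explicit limit was passed
max_input_length = 1024    # hypothetical fallback

# Same capping logic as the snippet above
if max_length > 1e6:
    if hasattr(tokenizer, "max_model_input_sizes") and tokenizer.max_model_input_sizes:
        max_length = max(v for _, v in tokenizer.max_model_input_sizes.items())
    else:
        max_length = max_input_length

print(max_length)  # 512: anything longer than this would be truncated
```

This explains why prompts longer than 512 tokens were being cut off even when the model itself supports longer contexts.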